1. 07 May, 2019 7 commits
    • cxgb4: Fix error path in cxgb4_init_module · a3147770
      YueHaibing authored
      BUG: unable to handle kernel paging request at ffffffffa016a270
      PGD 3270067 P4D 3270067 PUD 3271063 PMD 230bbd067 PTE 0
      Oops: 0000 [#1]
      CPU: 0 PID: 6134 Comm: modprobe Not tainted 5.1.0+ #33
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
      RIP: 0010:atomic_notifier_chain_register+0x24/0x60
      Code: 1f 80 00 00 00 00 55 48 89 e5 41 54 49 89 f4 53 48 89 fb e8 ae b4 38 01 48 8b 53 38 48 8d 4b 38 48 85 d2 74 20 45 8b 44 24 10 <44> 3b 42 10 7e 08 eb 13 44 39 42 10 7c 0d 48 8d 4a 08 48 8b 52 08
      RSP: 0018:ffffc90000e2bc60 EFLAGS: 00010086
      RAX: 0000000000000292 RBX: ffffffff83467240 RCX: ffffffff83467278
      RDX: ffffffffa016a260 RSI: ffffffff83752140 RDI: ffffffff83467240
      RBP: ffffc90000e2bc70 R08: 0000000000000000 R09: 0000000000000001
      R10: 0000000000000000 R11: 00000000014fa61f R12: ffffffffa01c8260
      R13: ffff888231091e00 R14: 0000000000000000 R15: ffffc90000e2be78
      FS:  00007fbd8d7cd540(0000) GS:ffff888237a00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: ffffffffa016a270 CR3: 000000022c7e3000 CR4: 00000000000006f0
      Call Trace:
       register_inet6addr_notifier+0x13/0x20
       cxgb4_init_module+0x6c/0x1000 [cxgb4]
       ? 0xffffffffa01d7000
       do_one_initcall+0x6c/0x3cc
       ? do_init_module+0x22/0x1f1
       ? rcu_read_lock_sched_held+0x97/0xb0
       ? kmem_cache_alloc_trace+0x325/0x3b0
       do_init_module+0x5b/0x1f1
       load_module+0x1db1/0x2690
       ? m_show+0x1d0/0x1d0
       __do_sys_finit_module+0xc5/0xd0
       __x64_sys_finit_module+0x15/0x20
       do_syscall_64+0x6b/0x1d0
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      If pci_register_driver fails, registering inet6addr_notifier is
      pointless. This patch fixes the error path in cxgb4_init_module.
      
      Fixes: b5a02f50 ("cxgb4 : Update ipv6 address handling api")
      Signed-off-by: YueHaibing <yuehaibing@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
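
A minimal sketch of the error path described in the cxgb4 commit above. The names `fake_pci_register_driver`, `fake_register_notifier` and `demo_init_module` are hypothetical stand-ins, not the real cxgb4 code: the point is only that the notifier is registered after the driver registration is known to have succeeded.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state; not the real cxgb4 code. */
static bool driver_registered;
static bool notifier_registered;

/* Models pci_register_driver(): may fail with a negative errno value. */
static int fake_pci_register_driver(bool should_fail)
{
    if (should_fail)
        return -ENODEV;
    driver_registered = true;
    return 0;
}

/* Models register_inet6addr_notifier(). */
static void fake_register_notifier(void)
{
    notifier_registered = true;
}

/* Returns 0 on success or a negative errno value on failure. */
int demo_init_module(bool pci_should_fail)
{
    int ret;

    driver_registered = false;
    notifier_registered = false;

    ret = fake_pci_register_driver(pci_should_fail);
    if (ret < 0)
        return ret; /* bail out before registering the notifier */

    /* Only reached once the driver is registered. */
    fake_register_notifier();
    return 0;
}

bool demo_notifier_registered(void)
{
    return notifier_registered;
}
```

When the init function fails, the module text is freed, so a notifier left registered at that point would refer to memory that no longer exists.
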
    • dt-bindings: net: Fix a typo in the phy-mode list for ethernet bindings · 822dd046
      Maxime Chevallier authored
      The phy_mode "2000base-x" is actually supposed to be "1000base-x", even
      though the commit title of the original patch says otherwise.
      
      Fixes: 55601a88 ("net: phy: Add 2000base-x, 2500base-x and rxaui modes")
      Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: macb: Change interrupt and napi enable order in open · 05044531
      Harini Katakam authored
      Current order in open:
      -> Enable interrupts (macb_init_hw)
      -> Enable NAPI
      -> Start PHY
      
      Sequence of RX handling:
      -> RX interrupt occurs
      -> Interrupt is cleared and interrupt bits disabled in handler
      -> NAPI is scheduled
      -> In NAPI, RX budget is processed and RX interrupts are re-enabled
      
      With the above, on QEMU or fixed link setups (where PHY state doesn't
      matter), there's a chance macb RX interrupt occurs before NAPI is
      enabled. This will result in NAPI being scheduled before it is enabled.
      Fix this in macb open by changing the order so that NAPI is enabled
      before interrupts.
      
      Fixes: ae1f2a56 ("net: macb: Added support for many RX queues")
      Signed-off-by: Harini Katakam <harini.katakam@xilinx.com>
      Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
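
The ordering problem in the macb commit above can be illustrated with a toy model. Everything here (`struct demo_dev`, `demo_rx_irq`, `demo_open`) is a hypothetical simplification and does not reflect how NAPI state is really tracked in the kernel. It assumes an RX interrupt fires as soon as interrupts are unmasked, as can happen on QEMU or fixed-link setups.

```c
#include <stdbool.h>

/* Toy device state; all fields are illustrative assumptions. */
struct demo_dev {
    bool napi_enabled;
    bool irq_enabled;
    int polls_scheduled;    /* NAPI polls scheduled while NAPI was enabled */
    int early_schedules;    /* schedule attempts made before NAPI was enabled */
};

/* Interrupt handler: mask RX interrupts and hand the work to NAPI. */
static void demo_rx_irq(struct demo_dev *d)
{
    if (!d->irq_enabled)
        return;
    d->irq_enabled = false;
    if (d->napi_enabled)
        d->polls_scheduled++;
    else
        d->early_schedules++;
}

/* Models open() with either ordering; an interrupt fires right after unmasking. */
void demo_open(struct demo_dev *d, bool napi_first)
{
    *d = (struct demo_dev){0};
    if (napi_first) {
        d->napi_enabled = true;
        d->irq_enabled = true;
        demo_rx_irq(d);
    } else {
        d->irq_enabled = true;
        demo_rx_irq(d);
        d->napi_enabled = true;
    }
}
```

With the old order the early interrupt tries to schedule NAPI before it is enabled. With NAPI enabled first, the same interrupt results in a normal poll.
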
    • vrf: sit mtu should not be updated when vrf netdev is the link · ff6ab32b
      Stephen Suryaputra authored
      The mtu of a VRF netdev isn't typically set and defaults to 65536. When
      the link of a tunnel is set, the tunnel mtu is changed from 1480 to the
      link mtu minus the tunnel header. When a VRF netdev is the link, the
      tunnel mtu therefore becomes 65516. Fix this by not setting the tunnel
      mtu in this case.
      Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
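
The arithmetic in the vrf commit above can be sketched as follows. `demo_sit_mtu` and `DEMO_TUNNEL_HDR_LEN` are hypothetical names. The 20-byte header length is inferred from the figures in the commit message (65536 - 65516), not taken from the actual patch.

```c
#include <stdbool.h>

/* Inferred from the commit message: 65536 - 65516 = 20 bytes of tunnel header. */
#define DEMO_TUNNEL_HDR_LEN 20

/*
 * Returns the tunnel mtu after a link is bound.
 * If the link is a VRF netdev, the current tunnel mtu is left untouched.
 */
int demo_sit_mtu(int current_mtu, int link_mtu, bool link_is_vrf)
{
    if (link_is_vrf)
        return current_mtu;
    return link_mtu - DEMO_TUNNEL_HDR_LEN;
}
```

For an ordinary 1500-byte link this yields the familiar 1480. For a VRF link the tunnel keeps its previous mtu instead of jumping to 65516.
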
    • net: dsa: Fix error cleanup path in dsa_init_module · 68be9302
      YueHaibing authored
      BUG: unable to handle kernel paging request at ffffffffa01c5430
      PGD 3270067 P4D 3270067 PUD 3271063 PMD 230bc5067 PTE 0
      Oops: 0000 [#1]
      CPU: 0 PID: 6159 Comm: modprobe Not tainted 5.1.0+ #33
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
      RIP: 0010:raw_notifier_chain_register+0x16/0x40
      Code: 63 f8 66 90 e9 5d ff ff ff 90 90 90 90 90 90 90 90 90 90 90 55 48 8b 07 48 89 e5 48 85 c0 74 1c 8b 56 10 3b 50 10 7e 07 eb 12 <39> 50 10 7c 0d 48 8d 78 08 48 8b 40 08 48 85 c0 75 ee 48 89 46 08
      RSP: 0018:ffffc90001c33c08 EFLAGS: 00010282
      RAX: ffffffffa01c5420 RBX: ffffffffa01db420 RCX: 4fcef45928070a8b
      RDX: 0000000000000000 RSI: ffffffffa01db420 RDI: ffffffffa01b0068
      RBP: ffffc90001c33c08 R08: 000000003e0a33d0 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000094443661 R12: ffff88822c320700
      R13: ffff88823109be80 R14: 0000000000000000 R15: ffffc90001c33e78
      FS:  00007fab8bd08540(0000) GS:ffff888237a00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: ffffffffa01c5430 CR3: 00000002297ea000 CR4: 00000000000006f0
      Call Trace:
       register_netdevice_notifier+0x43/0x250
       ? 0xffffffffa01e0000
       dsa_slave_register_notifier+0x13/0x70 [dsa_core]
       ? 0xffffffffa01e0000
       dsa_init_module+0x2e/0x1000 [dsa_core]
       do_one_initcall+0x6c/0x3cc
       ? do_init_module+0x22/0x1f1
       ? rcu_read_lock_sched_held+0x97/0xb0
       ? kmem_cache_alloc_trace+0x325/0x3b0
       do_init_module+0x5b/0x1f1
       load_module+0x1db1/0x2690
       ? m_show+0x1d0/0x1d0
       __do_sys_finit_module+0xc5/0xd0
       __x64_sys_finit_module+0x15/0x20
       do_syscall_64+0x6b/0x1d0
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Clean up allocated resources if there are errors,
      otherwise a memory leak will be triggered.
      
      Fixes: c9eb3e0f ("net: dsa: Add support for learning FDB through notification")
      Signed-off-by: YueHaibing <yuehaibing@huawei.com>
      Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • l2tp: Fix possible NULL pointer dereference · 638a3a1e
      YueHaibing authored
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000128
      PGD 0 P4D 0
      Oops: 0000 [#1]
      CPU: 0 PID: 5697 Comm: modprobe Tainted: G        W         5.1.0-rc7+ #1
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
      RIP: 0010:__lock_acquire+0x53/0x10b0
      Code: 8b 1c 25 40 5e 01 00 4c 8b 6d 10 45 85 e4 0f 84 bd 06 00 00 44 8b 1d 7c d2 09 02 49 89 fe 41 89 d2 45 85 db 0f 84 47 02 00 00 <48> 81 3f a0 05 70 83 b8 00 00 00 00 44 0f 44 c0 83 fe 01 0f 86 3a
      RSP: 0018:ffffc90001c07a28 EFLAGS: 00010002
      RAX: 0000000000000000 RBX: ffff88822f038440 RCX: 0000000000000000
      RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000128
      RBP: ffffc90001c07a88 R08: 0000000000000001 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000001
      R13: 0000000000000000 R14: 0000000000000128 R15: 0000000000000000
      FS:  00007fead0811540(0000) GS:ffff888237a00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000000128 CR3: 00000002310da000 CR4: 00000000000006f0
      Call Trace:
       ? __lock_acquire+0x24e/0x10b0
       lock_acquire+0xdf/0x230
       ? flush_workqueue+0x71/0x530
       flush_workqueue+0x97/0x530
       ? flush_workqueue+0x71/0x530
       l2tp_exit_net+0x170/0x2b0 [l2tp_core]
       ? l2tp_exit_net+0x93/0x2b0 [l2tp_core]
       ops_exit_list.isra.6+0x36/0x60
       unregister_pernet_operations+0xb8/0x110
       unregister_pernet_device+0x25/0x40
       l2tp_init+0x55/0x1000 [l2tp_core]
       ? 0xffffffffa018d000
       do_one_initcall+0x6c/0x3cc
       ? do_init_module+0x22/0x1f1
       ? rcu_read_lock_sched_held+0x97/0xb0
       ? kmem_cache_alloc_trace+0x325/0x3b0
       do_init_module+0x5b/0x1f1
       load_module+0x1db1/0x2690
       ? m_show+0x1d0/0x1d0
       __do_sys_finit_module+0xc5/0xd0
       __x64_sys_finit_module+0x15/0x20
       do_syscall_64+0x6b/0x1d0
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x7fead031a839
      Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 1f f6 2c 00 f7 d8 64 89 01 48
      RSP: 002b:00007ffe8d9acca8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
      RAX: ffffffffffffffda RBX: 0000560078398b80 RCX: 00007fead031a839
      RDX: 0000000000000000 RSI: 000056007659dc2e RDI: 0000000000000003
      RBP: 000056007659dc2e R08: 0000000000000000 R09: 0000560078398b80
      R10: 0000000000000003 R11: 0000000000000246 R12: 0000000000000000
      R13: 00005600783a04a0 R14: 0000000000040000 R15: 0000560078398b80
      Modules linked in: l2tp_core(+) e1000 ip_tables ipv6 [last unloaded: l2tp_core]
      CR2: 0000000000000128
      ---[ end trace 8322b2b8bf83f8e1 ]---
      
      If alloc_workqueue fails in l2tp_init, l2tp_net_ops
      is unregistered on the failure path. l2tp_exit_net
      is then called and flushes a NULL workqueue. This patch
      adds a NULL check to fix it.
      
      Fixes: 67e04c29 ("l2tp: unregister l2tp_net_ops on failure path")
      Signed-off-by: YueHaibing <yuehaibing@huawei.com>
      Acked-by: Guillaume Nault <gnault@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
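
A minimal sketch of the NULL check described in the l2tp commit above. `struct demo_wq`, `demo_flush` and `demo_exit_net` are hypothetical stand-ins for the workqueue, `flush_workqueue()` and `l2tp_exit_net()`. The return value exists only so the behaviour can be observed.

```c
#include <stddef.h>

/* Hypothetical stand-in for a workqueue. */
struct demo_wq {
    int flushes;
};

/* Models flush_workqueue(): dereferences its argument unconditionally. */
static void demo_flush(struct demo_wq *wq)
{
    wq->flushes++;
}

/*
 * Per-net exit handler. It may run on the init failure path,
 * before the workqueue has been allocated, so wq can be NULL.
 * Returns 1 if a flush was performed, 0 otherwise.
 */
int demo_exit_net(struct demo_wq *wq)
{
    if (wq == NULL)
        return 0;
    demo_flush(wq);
    return 1;
}
```
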
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · 982e826d
      David S. Miller authored
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf 2019-05-06
      
      The following pull-request contains BPF updates for your *net* tree.
      
      The main changes are:
      
      1) Two x32 JIT fixes: one which has buggy signed comparisons in 64
         bit conditional jumps and another one for 64 bit negation, both
         from Wang.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 05 May, 2019 8 commits
    • ipv4: Define __ipv4_neigh_lookup_noref when CONFIG_INET is disabled · 9b3040a6
      David Ahern authored
      Define __ipv4_neigh_lookup_noref to return NULL when CONFIG_INET is disabled.
      
      Fixes: 4b2a2bfe ("neighbor: Call __ipv4_neigh_lookup_noref in neigh_xmit")
      Reported-by: kbuild test robot <lkp@intel.com>
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
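
The stub pattern described in the ipv4 commit above can be sketched like this. `DEMO_CONFIG_INET`, `struct demo_neigh`, `demo_neigh_lookup_noref` and `demo_xmit` are hypothetical names chosen for illustration. The real helper and its config symbol live in the kernel headers.

```c
#include <stddef.h>

/* Hypothetical neighbour entry. */
struct demo_neigh {
    int dummy;
};

#ifdef DEMO_CONFIG_INET
/* With the feature enabled, the real lookup is provided elsewhere. */
struct demo_neigh *demo_neigh_lookup_noref(unsigned int key);
#else
/* With the feature disabled, callers still compile and simply see "not found". */
static inline struct demo_neigh *demo_neigh_lookup_noref(unsigned int key)
{
    (void)key;
    return NULL;
}
#endif

/* Caller that builds in both configurations. Returns 0 if an entry was found. */
int demo_xmit(unsigned int key)
{
    struct demo_neigh *n = demo_neigh_lookup_noref(key);

    return n ? 0 : -1;
}
```

The stub lets shared callers build without wrapping every call site in its own `#ifdef`.
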
    • ip6: fix skb leak in ip6frag_expire_frag_queue() · 47d3d7fd
      Eric Dumazet authored
      Since ip6frag_expire_frag_queue() now pulls the head skb
      from frag queue, we should no longer use skb_get(), since
      this leads to an skb leak.
      
      Stefan Bader initially reported a problem in 4.4.stable [1] caused
      by the skb_get(), so this patch should also fix this issue.
      
      [296583.091021] kernel BUG at /build/linux-6VmqmP/linux-4.4.0/net/core/skbuff.c:1207!
      [296583.091734] Call Trace:
      [296583.091749]  [<ffffffff81740e50>] __pskb_pull_tail+0x50/0x350
      [296583.091764]  [<ffffffff8183939a>] _decode_session6+0x26a/0x400
      [296583.091779]  [<ffffffff817ec719>] __xfrm_decode_session+0x39/0x50
      [296583.091795]  [<ffffffff818239d0>] icmpv6_route_lookup+0xf0/0x1c0
      [296583.091809]  [<ffffffff81824421>] icmp6_send+0x5e1/0x940
      [296583.091823]  [<ffffffff81753238>] ? __netif_receive_skb+0x18/0x60
      [296583.091838]  [<ffffffff817532b2>] ? netif_receive_skb_internal+0x32/0xa0
      [296583.091858]  [<ffffffffc0199f74>] ? ixgbe_clean_rx_irq+0x594/0xac0 [ixgbe]
      [296583.091876]  [<ffffffffc04eb260>] ? nf_ct_net_exit+0x50/0x50 [nf_defrag_ipv6]
      [296583.091893]  [<ffffffff8183d431>] icmpv6_send+0x21/0x30
      [296583.091906]  [<ffffffff8182b500>] ip6_expire_frag_queue+0xe0/0x120
      [296583.091921]  [<ffffffffc04eb27f>] nf_ct_frag6_expire+0x1f/0x30 [nf_defrag_ipv6]
      [296583.091938]  [<ffffffff810f3b57>] call_timer_fn+0x37/0x140
      [296583.091951]  [<ffffffffc04eb260>] ? nf_ct_net_exit+0x50/0x50 [nf_defrag_ipv6]
      [296583.091968]  [<ffffffff810f5464>] run_timer_softirq+0x234/0x330
      [296583.091982]  [<ffffffff8108a339>] __do_softirq+0x109/0x2b0
      
      Fixes: d4289fcc ("net: IP6 defrag: use rbtrees for IPv6 defrag")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Stefan Bader <stefan.bader@canonical.com>
      Cc: Peter Oskolkov <posk@google.com>
      Cc: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
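
The leak described in the ip6 commit above comes down to reference counting. The sketch below uses a hypothetical `struct demo_skb` with a plain integer count. `demo_skb_get` and `demo_kfree_skb` only imitate the effect of `skb_get()` and `kfree_skb()` on that count.

```c
/* Hypothetical buffer with a plain reference count. */
struct demo_skb {
    int refs;
};

/* Takes an additional reference, like skb_get(). */
static void demo_skb_get(struct demo_skb *skb)
{
    skb->refs++;
}

/* Drops one reference, like kfree_skb(); the buffer is freed at zero. */
static void demo_kfree_skb(struct demo_skb *skb)
{
    skb->refs--;
}

/*
 * Expire handler. The head skb has already been pulled from the
 * queue, so the caller owns its single reference.
 * Returns the references left afterwards; nonzero means a leak.
 */
int demo_expire(struct demo_skb *head, int take_extra_ref)
{
    if (take_extra_ref)
        demo_skb_get(head);

    /* ... the ICMP error would be sent here using head ... */

    demo_kfree_skb(head);
    return head->refs;
}
```

With the extra reference one count is never dropped. Without it the single owned reference is released and nothing leaks.
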
    • net: ucc_geth - fix Oops when changing number of buffers in the ring · ee0df193
      Christophe Leroy authored
      When changing the number of buffers in the RX ring while the interface
      is running, the following Oops is encountered due to the new number
      of buffers being taken into account immediately while their allocation
      is done when opening the device only.
      
      [   69.882706] Unable to handle kernel paging request for data at address 0xf0000100
      [   69.890172] Faulting instruction address: 0xc033e164
      [   69.895122] Oops: Kernel access of bad area, sig: 11 [#1]
      [   69.900494] BE PREEMPT CMPCPRO
      [   69.907120] CPU: 0 PID: 0 Comm: swapper Not tainted 4.14.115-00006-g179ade8ce3-dirty #269
      [   69.915956] task: c0684310 task.stack: c06da000
      [   69.920470] NIP:  c033e164 LR: c02e44d0 CTR: c02e41fc
      [   69.925504] REGS: dfff1e20 TRAP: 0300   Not tainted  (4.14.115-00006-g179ade8ce3-dirty)
      [   69.934161] MSR:  00009032 <EE,ME,IR,DR,RI>  CR: 22004428  XER: 20000000
      [   69.940869] DAR: f0000100 DSISR: 20000000
      [   69.940869] GPR00: c0352d70 dfff1ed0 c0684310 f00000a4 00000040 dfff1f68 00000000 0000001f
      [   69.940869] GPR08: df53f410 1cc00040 00000021 c0781640 42004424 100c82b6 f00000a4 df53f5b0
      [   69.940869] GPR16: df53f6c0 c05daf84 00000040 00000000 00000040 c0782be4 00000000 00000001
      [   69.940869] GPR24: 00000000 df53f400 000001b0 df53f410 df53f000 0000003f df708220 1cc00044
      [   69.978348] NIP [c033e164] skb_put+0x0/0x5c
      [   69.982528] LR [c02e44d0] ucc_geth_poll+0x2d4/0x3f8
      [   69.987384] Call Trace:
      [   69.989830] [dfff1ed0] [c02e4554] ucc_geth_poll+0x358/0x3f8 (unreliable)
      [   69.996522] [dfff1f20] [c0352d70] net_rx_action+0x248/0x30c
      [   70.002099] [dfff1f80] [c04e93e4] __do_softirq+0xfc/0x310
      [   70.007492] [dfff1fe0] [c0021124] irq_exit+0xd0/0xd4
      [   70.012458] [dfff1ff0] [c000e7e0] call_do_irq+0x24/0x3c
      [   70.017683] [c06dbe80] [c0006bac] do_IRQ+0x64/0xc4
      [   70.022474] [c06dbea0] [c001097c] ret_from_except+0x0/0x14
      [   70.027964] --- interrupt: 501 at rcu_idle_exit+0x84/0x90
      [   70.027964]     LR = rcu_idle_exit+0x74/0x90
      [   70.037585] [c06dbf60] [20000000] 0x20000000 (unreliable)
      [   70.042984] [c06dbf80] [c004bb0c] do_idle+0xb4/0x11c
      [   70.047945] [c06dbfa0] [c004bd14] cpu_startup_entry+0x18/0x1c
      [   70.053682] [c06dbfb0] [c05fb034] start_kernel+0x370/0x384
      [   70.059153] [c06dbff0] [00003438] 0x3438
      [   70.063062] Instruction dump:
      [   70.066023] 38a00000 38800000 90010014 4bfff015 80010014 7c0803a6 3123ffff 7c691910
      [   70.073767] 38210010 4e800020 38600000 4e800020 <80e3005c> 80c30098 3107ffff 7d083910
      [   70.081690] ---[ end trace be7ccd9c1e1a9f12 ]---
      
      This patch forbids the modification of the number of buffers in the
      ring while the interface is running.
      
      Fixes: ac421852 ("ucc_geth: add ethtool support")
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
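
A minimal sketch of the guard described in the ucc_geth commit above. `struct demo_netdev` and `demo_set_ringparam` are hypothetical, and returning `-EBUSY` is an assumption about how such a refusal is typically reported, not a quote from the patch.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical device state. */
struct demo_netdev {
    bool running;
    int rx_ring_size;
};

/*
 * Changes the RX ring size. Buffers are only allocated when the
 * device is opened, so the change is refused while it is running.
 * Returns 0 on success or -EBUSY (assumed error code) if running.
 */
int demo_set_ringparam(struct demo_netdev *dev, int new_size)
{
    if (dev->running)
        return -EBUSY;
    dev->rx_ring_size = new_size;
    return 0;
}
```
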
    • dpaa_eth: fix SG frame cleanup · 17170e65
      Laurentiu Tudor authored
      Fix issue with the entry indexing in the sg frame cleanup code being
      off-by-1. This problem showed up when doing some basic iperf tests and
      manifested in traffic coming to a halt.
      Signed-off-by: Laurentiu Tudor <laurentiu.tudor@nxp.com>
      Acked-by: Madalin Bucur <madalin.bucur@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: rds: fix spelling mistake "syctl" -> "sysctl" · d14a108d
      Colin Ian King authored
      There is a spelling mistake in a pr_warn warning. Fix it.
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • cls_cgroup: avoid panic when receiving a packet before filter set · 594725db
      Matteo Croce authored
      When a cgroup classifier is added, there is a small time interval in
      which tp->root is NULL. If we receive a packet in this small time slice
      a NULL pointer dereference will happen, leading to a kernel panic:
      
          # mkdir /sys/fs/cgroup/net_cls/0
          # echo 0x100001 >  /sys/fs/cgroup/net_cls/0/net_cls.classid
          # echo $$ >/sys/fs/cgroup/net_cls/0/tasks
          # ping -qfb 255.255.255.255 -I eth0 &>/dev/null &
          # tc qdisc add dev eth0 root handle 10: htb
          # while : ; do
          > tc filter add dev eth0 parent 10: protocol ip prio 10 handle 1: cgroup
          > tc filter delete dev eth0
          > done
          Unable to handle kernel NULL pointer dereference at virtual address 0000000000000028
          Mem abort info:
            ESR = 0x96000005
            Exception class = DABT (current EL), IL = 32 bits
            SET = 0, FnV = 0
            EA = 0, S1PTW = 0
          Data abort info:
            ISV = 0, ISS = 0x00000005
            CM = 0, WnR = 0
          user pgtable: 4k pages, 39-bit VAs, pgdp = 0000000098a7ff91
          [0000000000000028] pgd=0000000000000000, pud=0000000000000000
          Internal error: Oops: 96000005 [#1] SMP
          Modules linked in: sch_htb cls_cgroup algif_hash af_alg nls_iso8859_1 nls_cp437 vfat fat xhci_plat_hcd m25p80 spi_nor xhci_hcd mtd usbcore usb_common spi_orion sfp i2c_mv64xxx phy_generic mdio_i2c marvell10g i2c_core mvpp2 mvmdio phylink sbsa_gwdt ip_tables x_tables autofs4
          Process ping (pid: 5421, stack limit = 0x00000000b20b1505)
          CPU: 3 PID: 5421 Comm: ping Not tainted 5.1.0-rc6 #31
          Hardware name: Marvell 8040 MACCHIATOBin Double-shot (DT)
          pstate: 60000005 (nZCv daif -PAN -UAO)
          pc : cls_cgroup_classify+0x80/0xec [cls_cgroup]
          lr : cls_cgroup_classify+0x34/0xec [cls_cgroup]
          sp : ffffff8012e6b850
          x29: ffffff8012e6b850 x28: ffffffc423dd3c00
          x27: ffffff801093ebc0 x26: ffffffc425a85b00
          x25: 0000000020000000 x24: 0000000000000000
          x23: ffffff8012e6b910 x22: ffffffc428db4900
          x21: ffffff8012e6b910 x20: 0000000000100001
          x19: 0000000000000000 x18: 0000000000000000
          x17: 0000000000000000 x16: 0000000000000000
          x15: 0000000000000000 x14: 0000000000000000
          x13: 0000000000000000 x12: 000000000000001c
          x11: 0000000000000018 x10: ffffff8012e6b840
          x9 : 0000000000003580 x8 : 000000000000009d
          x7 : 0000000000000002 x6 : ffffff8012e6b860
          x5 : 000000007cd66ffe x4 : 000000009742a193
          x3 : ffffff800865b4d8 x2 : ffffff8012e6b910
          x1 : 0000000000000400 x0 : ffffffc42c38f300
          Call trace:
           cls_cgroup_classify+0x80/0xec [cls_cgroup]
           tcf_classify+0x78/0x138
           htb_enqueue+0x74/0x320 [sch_htb]
           __dev_queue_xmit+0x3e4/0x9d0
           dev_queue_xmit+0x24/0x30
           ip_finish_output2+0x2e4/0x4d0
           ip_finish_output+0x1d8/0x270
           ip_mc_output+0xa8/0x240
           ip_local_out+0x58/0x68
           ip_send_skb+0x2c/0x88
           ip_push_pending_frames+0x44/0x50
           raw_sendmsg+0x458/0x830
           inet_sendmsg+0x54/0xe8
           sock_sendmsg+0x34/0x50
           __sys_sendto+0xd0/0x120
           __arm64_sys_sendto+0x30/0x40
           el0_svc_common.constprop.0+0x88/0xf8
           el0_svc_handler+0x2c/0x38
           el0_svc+0x8/0xc
          Code: 39496001 360002a1 b9425c14 34000274 (79405260)
      
      Fixes: ed76f5ed ("net: sched: protect filter_chain list with filter_chain_lock mutex")
      Suggested-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: Matteo Croce <mcroce@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • isdn: bas_gigaset: use usb_fill_int_urb() properly · 4014dfae
      Paul Bolle authored
      The switch to make bas_gigaset use usb_fill_int_urb() - instead of
      filling that urb "by hand" - missed the subtle ordering of the previous
      code.
      
      See, before the switch urb->dev was set to a member somewhere deep in a
      complicated structure and then supplied to usb_rcvisocpipe() and
      usb_sndisocpipe(). After that switch urb->dev wasn't set to anything
      specific before being supplied to those two macros. This triggers a
      nasty oops:
      
          BUG: unable to handle kernel NULL pointer dereference at 00000000
          #PF error: [normal kernel read fault]
          *pde = 00000000
          Oops: 0000 [#1] SMP
          CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.1.0-0.rc4.1.local0.fc28.i686 #1
          Hardware name: IBM 2525FAG/2525FAG, BIOS 74ET64WW (2.09 ) 12/14/2006
          EIP: gigaset_init_bchannel+0x89/0x320 [bas_gigaset]
          Code: 75 07 83 8b 84 00 00 00 40 8d 47 74 c7 07 01 00 00 00 89 45 f0 8b 44 b7 68 85 c0 0f 84 6a 02 00 00 8b 48 28 8b 93 88 00 00 00 <8b> 09 8d 54 12 03 c1 e2 0f c1 e1 08 09 ca 8b 8b 8c 00 00 00 80 ca
          EAX: f05ec200 EBX: ed404200 ECX: 00000000 EDX: 00000000
          ESI: 00000000 EDI: f065a000 EBP: f30c9f40 ESP: f30c9f20
          DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010086
          CR0: 80050033 CR2: 00000000 CR3: 0ddc7000 CR4: 000006d0
          Call Trace:
           <SOFTIRQ>
           ? gigaset_isdn_connD+0xf6/0x140 [gigaset]
           gigaset_handle_event+0x173e/0x1b90 [gigaset]
           tasklet_action_common.isra.16+0x4e/0xf0
           tasklet_action+0x1e/0x20
           __do_softirq+0xb2/0x293
           ? __irqentry_text_end+0x3/0x3
           call_on_stack+0x45/0x50
           </SOFTIRQ>
           ? irq_exit+0xb5/0xc0
           ? do_IRQ+0x78/0xd0
           ? acpi_idle_enter_s2idle+0x50/0x50
           ? common_interrupt+0xd4/0xdc
           ? acpi_idle_enter_s2idle+0x50/0x50
           ? sched_cpu_activate+0x1b/0xf0
           ? acpi_fan_resume.cold.7+0x9/0x18
           ? cpuidle_enter_state+0x152/0x4c0
           ? cpuidle_enter+0x14/0x20
           ? call_cpuidle+0x21/0x40
           ? do_idle+0x1c8/0x200
           ? cpu_startup_entry+0x25/0x30
           ? rest_init+0x88/0x8a
           ? arch_call_rest_init+0xd/0x19
           ? start_kernel+0x42f/0x448
           ? i386_start_kernel+0xac/0xb0
           ? startup_32_smp+0x164/0x168
          Modules linked in: ppp_generic slhc capi bas_gigaset gigaset kernelcapi nf_conntrack_netbios_ns nf_conntrack_broadcast xt_CT ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables sunrpc ipw2200 iTCO_wdt gpio_ich snd_intel8x0 libipw iTCO_vendor_support snd_ac97_codec lib80211 ppdev ac97_bus snd_seq cfg80211 snd_seq_device pcspkr thinkpad_acpi lpc_ich snd_pcm i2c_i801 snd_timer ledtrig_audio snd soundcore rfkill parport_pc parport pcc_cpufreq acpi_cpufreq i915 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sdhci_pci sysimgblt cqhci fb_sys_fops drm sdhci mmc_core tg3 ata_generic serio_raw yenta_socket pata_acpi video
          CR2: 0000000000000000
          ---[ end trace 1fe07487b9200c73 ]---
          EIP: gigaset_init_bchannel+0x89/0x320 [bas_gigaset]
          Code: 75 07 83 8b 84 00 00 00 40 8d 47 74 c7 07 01 00 00 00 89 45 f0 8b 44 b7 68 85 c0 0f 84 6a 02 00 00 8b 48 28 8b 93 88 00 00 00 <8b> 09 8d 54 12 03 c1 e2 0f c1 e1 08 09 ca 8b 8b 8c 00 00 00 80 ca
          EAX: f05ec200 EBX: ed404200 ECX: 00000000 EDX: 00000000
          ESI: 00000000 EDI: f065a000 EBP: f30c9f40 ESP: cddcb3bc
          DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010086
          CR0: 80050033 CR2: 00000000 CR3: 0ddc7000 CR4: 000006d0
          Kernel panic - not syncing: Fatal exception in interrupt
          Kernel Offset: 0xcc00000 from 0xc0400000 (relocation range: 0xc0000000-0xf6ffdfff)
          ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
      
      No-one noticed because this Oops is apparently only triggered by setting
      up an ISDN data connection on a live ISDN line on a gigaset base (i.e.,
      the PBX that the gigaset driver supports). Very few people do that
      running present day kernels.
      
      Anyhow, a little code reorganization makes this problem go away, while
      avoiding the subtle ordering that was used in the past. So let's do
      that.
      
      Fixes: 78c696c1 ("isdn: gigaset: use usb_fill_int_urb()")
      Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: fix phy_validate_pause · b4010af9
      Heiner Kallweit authored
      We have valid scenarios where ETHTOOL_LINK_MODE_Pause_BIT doesn't
      need to be supported. Therefore extend the first check to check
      for rx_pause being set.
      
      See also phy_set_asym_pause:
      rx=0 and tx=1: advertise asym pause only
      rx=0 and tx=0: stop advertising both pause modes
      
      The fixed commit isn't wrong, it's just the one that introduced the
      linkmode bitmaps.
      
      Fixes: 3c1bcc86 ("net: ethernet: Convert phydev advertize and supported from u32 to link mode")
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
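
The check described in the phy commit above might look roughly like this. `struct demo_caps` and `demo_validate_pause` are hypothetical. The first condition follows the commit message (only reject a missing Pause capability when rx_pause is requested). The second condition, on asymmetric pause, is an assumption added to make the sketch complete.

```c
#include <stdbool.h>

/* Hypothetical summary of what the PHY supports. */
struct demo_caps {
    bool pause;         /* symmetric pause supported */
    bool asym_pause;    /* asymmetric pause supported */
};

/* Returns true if the requested rx/tx pause setting can be honoured. */
bool demo_validate_pause(struct demo_caps supported, bool rx_pause, bool tx_pause)
{
    /* Per the commit message: Pause support only matters if rx_pause is set. */
    if (!supported.pause && rx_pause)
        return false;

    /* Assumption: differing rx/tx settings need asymmetric pause support. */
    if (!supported.asym_pause && rx_pause != tx_pause)
        return false;

    return true;
}
```

Under these assumptions a PHY that supports only asymmetric pause accepts both `rx=0, tx=1` and `rx=0, tx=0`, which matches the two cases listed in the commit message.
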
  3. 04 May, 2019 14 commits
    • ipmr_base: Do not reset index in mr_table_dump · 7fcd1e03
      David Ahern authored
      e is the counter used to save the location of a dump when an
      skb is filled. Once the walk of the table is complete, mr_table_dump
      needs to return without resetting that index to 0. Because of the
      reset, the dump of a specific table loops: there is no way to
      indicate that the walk of the table is done.
      
      Move the reset to the caller so the dump of each table starts at 0,
      but the loop counter is maintained if a dump fills an skb.
      
      Fixes: e1cedae1 ("ipmr: Refactor mr_rtm_dumproute")
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
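
A toy model of the resumable dump described in the ipmr_base commit above. `struct demo_cb` and `demo_table_dump` are hypothetical. Entries are just counted rather than written into an skb, and `room` stands in for how many entries fit in one buffer.

```c
/* Saved dump position, carried across calls by the caller. */
struct demo_cb {
    int e;
};

/*
 * Dumps entries starting at cb->e until the table ends or the
 * buffer is full. The position is saved in cb->e and is NOT reset
 * to 0 when the walk completes; resetting is left to the caller
 * when it moves on to another table.
 * Returns the number of entries written in this call.
 */
int demo_table_dump(int n_entries, int room, struct demo_cb *cb)
{
    int written = 0;
    int e;

    for (e = cb->e; e < n_entries; e++) {
        if (written == room)
            break;  /* buffer full: resume from e next time */
        written++;
    }

    cb->e = e;
    return written;
}
```

Because the index stays at the end of the table, a further call writes nothing, which is how the caller can tell that the walk is done. Had the index been reset to 0, the next call would start the same table over again.
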
    • cls_matchall: avoid panic when receiving a packet before filter set · 25426043
      Matteo Croce authored
      When a matchall classifier is added, there is a small time interval in
      which tp->root is NULL. If we receive a packet in this small time slice
      a NULL pointer dereference will happen, leading to a kernel panic:
      
          # tc qdisc replace dev eth0 ingress
          # tc filter add dev eth0 parent ffff: matchall action gact drop
          Unable to handle kernel NULL pointer dereference at virtual address 0000000000000034
          Mem abort info:
            ESR = 0x96000005
            Exception class = DABT (current EL), IL = 32 bits
            SET = 0, FnV = 0
            EA = 0, S1PTW = 0
          Data abort info:
            ISV = 0, ISS = 0x00000005
            CM = 0, WnR = 0
          user pgtable: 4k pages, 39-bit VAs, pgdp = 00000000a623d530
          [0000000000000034] pgd=0000000000000000, pud=0000000000000000
          Internal error: Oops: 96000005 [#1] SMP
          Modules linked in: cls_matchall sch_ingress nls_iso8859_1 nls_cp437 vfat fat m25p80 spi_nor mtd xhci_plat_hcd xhci_hcd phy_generic sfp mdio_i2c usbcore i2c_mv64xxx marvell10g mvpp2 usb_common spi_orion mvmdio i2c_core sbsa_gwdt phylink ip_tables x_tables autofs4
          Process ksoftirqd/0 (pid: 9, stack limit = 0x0000000009de7d62)
          CPU: 0 PID: 9 Comm: ksoftirqd/0 Not tainted 5.1.0-rc6 #21
          Hardware name: Marvell 8040 MACCHIATOBin Double-shot (DT)
          pstate: 40000005 (nZcv daif -PAN -UAO)
          pc : mall_classify+0x28/0x78 [cls_matchall]
          lr : tcf_classify+0x78/0x138
          sp : ffffff80109db9d0
          x29: ffffff80109db9d0 x28: ffffffc426058800
          x27: 0000000000000000 x26: ffffffc425b0dd00
          x25: 0000000020000000 x24: 0000000000000000
          x23: ffffff80109dbac0 x22: 0000000000000001
          x21: ffffffc428ab5100 x20: ffffffc425b0dd00
          x19: ffffff80109dbac0 x18: 0000000000000000
          x17: 0000000000000000 x16: 0000000000000000
          x15: 0000000000000000 x14: 0000000000000000
          x13: ffffffbf108ad288 x12: dead000000000200
          x11: 00000000f0000000 x10: 0000000000000001
          x9 : ffffffbf1089a220 x8 : 0000000000000001
          x7 : ffffffbebffaa950 x6 : 0000000000000000
          x5 : 000000442d6ba000 x4 : 0000000000000000
          x3 : ffffff8008735ad8 x2 : ffffff80109dbac0
          x1 : ffffffc425b0dd00 x0 : ffffff8010592078
          Call trace:
           mall_classify+0x28/0x78 [cls_matchall]
           tcf_classify+0x78/0x138
           __netif_receive_skb_core+0x29c/0xa20
           __netif_receive_skb_one_core+0x34/0x60
           __netif_receive_skb+0x28/0x78
           netif_receive_skb_internal+0x2c/0xc0
           napi_gro_receive+0x1a0/0x1d8
           mvpp2_poll+0x928/0xb18 [mvpp2]
           net_rx_action+0x108/0x378
           __do_softirq+0x128/0x320
           run_ksoftirqd+0x44/0x60
           smpboot_thread_fn+0x168/0x1b0
           kthread+0x12c/0x130
           ret_from_fork+0x10/0x1c
          Code: aa0203f3 aa1e03e0 d503201f f9400684 (b9403480)
          ---[ end trace fc71e2ef7b8ab5a5 ]---
          Kernel panic - not syncing: Fatal exception in interrupt
          SMP: stopping secondary CPUs
          Kernel Offset: disabled
          CPU features: 0x002,00002000
          Memory Limit: none
          Rebooting in 1 seconds..
      
      Fix this by adding a NULL check in mall_classify().
      
      Fixes: ed76f5ed ("net: sched: protect filter_chain list with filter_chain_lock mutex")
      Signed-off-by: Matteo Croce <mcroce@redhat.com>
      Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      25426043
    • David Ahern's avatar
      neighbor: Call __ipv4_neigh_lookup_noref in neigh_xmit · 4b2a2bfe
      David Ahern authored
      Commit cd9ff4de changed the key for IFF_POINTOPOINT devices to
      INADDR_ANY, but neigh_xmit, which is used for MPLS encapsulations, was
      not updated to use the altered key. As a result, every packet Tx does a
      lookup on the gateway address that does not find an entry. A new entry
      is then created, only to find the existing one in the table right
      before the insert, since arp_constructor was updated to reset the
      primary key. This is seen in the allocs and destroys counters:
          ip -s -4 ntable show | head -10 | grep alloc
      
      which increase for each packet, showing the unnecessary overhead.
      
      Fix by having neigh_xmit use __ipv4_neigh_lookup_noref for NEIGH_ARP_TABLE.
      
      Fixes: cd9ff4de ("ipv4: Make neigh lookup keys for loopback/point-to-point devices be INADDR_ANY")
      Reported-by: Alan Maguire <alan.maguire@oracle.com>
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Tested-by: Alan Maguire <alan.maguire@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4b2a2bfe
    • David Ahern's avatar
      neighbor: Reset gc_entries counter if new entry is released before insert · 64c6f4bb
      David Ahern authored
      Ian and Alan both reported seeing overflows after upgrades to 5.x kernels:
        neighbour: arp_cache: neighbor table overflow!
      
      Alan's mpls script helped get to the bottom of this bug. When a new entry
      is created the gc_entries counter is bumped in neigh_alloc to check if a
      new one is allowed to be created. ___neigh_create then searches for an
      existing entry before inserting the just allocated one. If an entry
      already exists, the new one is dropped in favor of the existing one. In
      this case the cleanup path needs to drop the gc_entries counter. There
      is no memory leak, only a counter leak.
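
      The counter leak described above can be sketched in user space. This is
      a minimal illustration, not the kernel code: toy_table, toy_alloc and
      toy_create are hypothetical stand-ins for the neighbour table,
      neigh_alloc and ___neigh_create, and a linked list replaces the hash
      table.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for the kernel's neighbour table. */
struct toy_entry {
    int key;
    struct toy_entry *next;
};

struct toy_table {
    struct toy_entry *head;
    int gc_entries;   /* number of entries counted against the limit */
    int gc_limit;     /* allocation is refused at this count ("overflow") */
};

/* Like neigh_alloc in the description: bump the counter on allocation. */
static struct toy_entry *toy_alloc(struct toy_table *t, int key)
{
    struct toy_entry *e;

    assert(t != NULL);
    if (t->gc_entries >= t->gc_limit)
        return NULL;              /* "neighbor table overflow!" */
    e = malloc(sizeof(*e));
    if (!e)
        return NULL;
    t->gc_entries++;
    e->key = key;
    e->next = NULL;
    return e;
}

/* Allocate first, then search for an existing entry before inserting. */
static struct toy_entry *toy_create(struct toy_table *t, int key)
{
    struct toy_entry *n = toy_alloc(t, key);
    struct toy_entry *e;

    if (!n)
        return NULL;
    for (e = t->head; e; e = e->next) {
        if (e->key == key) {
            /* The fix: the dropped entry must also give back its count. */
            t->gc_entries--;
            free(n);
            return e;
        }
    }
    n->next = t->head;
    t->head = n;
    return n;
}
```

      Without the decrement in the duplicate path, repeated lookups of the
      same key would push gc_entries up to gc_limit even though only one
      entry exists, which matches the overflow symptom reported above.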
      
      Fixes: 58956317 ("neighbor: Improve garbage collection")
      Reported-by: Ian Kumlien <ian.kumlien@gmail.com>
      Reported-by: Alan Maguire <alan.maguire@oracle.com>
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Tested-by: Alan Maguire <alan.maguire@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      64c6f4bb
    • David S. Miller's avatar
      Merge branch 'ena-fixes' · f0c5bcf2
      David S. Miller authored
      Sameeh Jubran says:
      
      ====================
      Bug fixes for ENA Ethernet driver
      
      Sameeh Jubran (8):
        net: ena: fix swapped parameters when calling
          ena_com_indirect_table_fill_entry
        net: ena: fix: set freed objects to NULL to avoid failing future
          allocations
        net: ena: fix: Free napi resources when ena_up() fails
        net: ena: fix incorrect test of supported hash function
        net: ena: fix return value of ena_com_config_llq_info()
        net: ena: improve latency by disabling adaptive interrupt moderation
          by default
        net: ena: fix ena_com_fill_hash_function() implementation
        net: ena: gcc 8: fix compilation warning
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f0c5bcf2
    • Sameeh Jubran's avatar
      net: ena: gcc 8: fix compilation warning · f9133088
      Sameeh Jubran authored
      GCC 8 contains a number of new warnings as well as enhancements to existing
      checkers. The -Wstringop-truncation warning flags calls to bounded
      string manipulation functions such as strncat, strncpy, and stpncpy that
      may either truncate the copied string or leave the destination unchanged.
      
      In our case the destination string length (32 bytes) is much shorter than
      the source string (64 bytes), which causes this warning to show up. In
      general, the destination has to be at least a byte larger than the length
      of the source string with strncpy for this warning not to show up.
      
      This can be easily fixed by using strlcpy instead, which already
      truncates the string. Documentation for this function can be
      found here:
      
      https://elixir.bootlin.com/linux/latest/source/lib/string.c#L141
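
      The truncating behaviour relied on here can be sketched in portable C.
      toy_strlcpy is a hypothetical helper that mimics the usual strlcpy
      contract (copy at most size - 1 bytes, always NUL-terminate, return the
      source length); it is not the kernel implementation linked above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical strlcpy-like helper: copies at most size - 1 bytes of src
 * into dst, always NUL-terminates when size > 0, and returns strlen(src)
 * so the caller can detect truncation (return value >= size).
 */
static size_t toy_strlcpy(char *dst, const char *src, size_t size)
{
    size_t len;

    assert(dst != NULL && src != NULL);
    len = strlen(src);
    if (size > 0) {
        size_t n = len < size - 1 ? len : size - 1;

        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return len;
}
```

      Unlike strncpy, which leaves the destination unterminated when the
      source is at least as long as the bound, this form always yields a
      valid C string in the smaller buffer.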
      
      Fixes: 1738cd3e ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
      Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f9133088
    • Sameeh Jubran's avatar
      net: ena: fix ena_com_fill_hash_function() implementation · 11bd7a00
      Sameeh Jubran authored
      ena_com_fill_hash_function() didn't configure the rss->hash_func.
      
      Fixes: 1738cd3e ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
      Signed-off-by: Netanel Belgazal <netanel@amazon.com>
      Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      11bd7a00
    • Sameeh Jubran's avatar
      net: ena: improve latency by disabling adaptive interrupt moderation by default · 78cb421d
      Sameeh Jubran authored
      Adaptive interrupt moderation was erroneously enabled by default
      in the driver.
      
      In case the device supports adaptive interrupt moderation it will
      be automatically used, which may potentially increase latency.
      
      The adaptive moderation can be enabled from ethtool command in
      case the feature is supported by the device.
      
      Fixes: 1738cd3e ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
      Signed-off-by: Guy Tzalik <gtzalik@amazon.com>
      Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      78cb421d
    • Sameeh Jubran's avatar
      net: ena: fix return value of ena_com_config_llq_info() · 9a27de0c
      Sameeh Jubran authored
      ena_com_config_llq_info() returns 0 even if ena_com_set_llq() fails.
      Return the failure code of ena_com_set_llq() in case it fails.
      
      Fixes: 689b2bda ("net: ena: add functions for handling Low Latency Queues in ena_com")
      Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
      Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9a27de0c
    • Sameeh Jubran's avatar
      net: ena: fix incorrect test of supported hash function · d3cfe7dd
      Sameeh Jubran authored
      ena_com_set_hash_function() tests if a hash function is supported
      by the device before setting it.
      The test returns the opposite of the required result.
      Reverse the condition to return the correct value.
      Also use the BIT macro instead of an inline shift.
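
      A minimal sketch of the corrected check, assuming the device reports
      its supported hash functions as a bitmask. TOY_BIT, supported_mask and
      the toy_* functions are hypothetical names for illustration and do not
      reflect the driver's actual structures.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical equivalent of the kernel's BIT() macro for 32-bit masks. */
#define TOY_BIT(n) ((uint32_t)1 << (n))

/* True when bit number func is set in the supported-functions mask. */
static bool toy_hash_func_supported(uint32_t supported_mask, unsigned int func)
{
    assert(func < 32);
    return (supported_mask & TOY_BIT(func)) != 0;
}

/* Reject only the functions the device does NOT support. */
static int toy_set_hash_function(uint32_t supported_mask, unsigned int func)
{
    if (!toy_hash_func_supported(supported_mask, func))
        return -EOPNOTSUPP;
    /* ... the function would be programmed into the device here ... */
    return 0;
}
```

      An inverted condition would return -EOPNOTSUPP for exactly the
      functions the device does support, which is the kind of bug the commit
      describes.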
      
      Fixes: 1738cd3e ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
      Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
      Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d3cfe7dd
    • Sameeh Jubran's avatar
      net: ena: fix: Free napi resources when ena_up() fails · b287cdbd
      Sameeh Jubran authored
      ena_up() calls ena_init_napi() but does not call ena_del_napi() in
      case of failure. This causes a segmentation fault upon rmmod when
      netif_napi_del() is called. Fix this bug by calling ena_del_napi()
      before returning error from ena_up().
      
      Fixes: 1738cd3e ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
      Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
      Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b287cdbd
    • Sameeh Jubran's avatar
      net: ena: fix: set freed objects to NULL to avoid failing future allocations · 8ee8ee7f
      Sameeh Jubran authored
      In some cases when a queue related allocation fails, successful past
      allocations are freed but the pointer that pointed to them is not
      set to NULL. This is a problem for 2 reasons:
      1. This is generally a bad practice since this pointer might be
      accidentally accessed in the future.
      2. Future allocations using the same pointer check if the pointer
      is NULL and fail if it is not.
      
      Fixed this by setting such pointers to NULL in the allocation of
      queue related objects.
      
      Also refactored the code of ena_setup_tx_resources() to goto-style
      error handling to avoid code duplication of resource freeing.
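
      The pattern described above (goto-style unwinding that also resets
      freed pointers to NULL) might look roughly like the following sketch.
      toy_ring, its fields, the fail_last switch and the -EEXIST return for
      a stale pointer are all hypothetical and are not taken from the ENA
      driver.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical ring with three per-queue allocations. */
struct toy_ring {
    void *buffer_info;
    void *free_ids;
    void *push_buf;
};

/* fail_last simulates a failure of the third allocation. */
static int toy_setup_ring(struct toy_ring *ring, size_t n, int fail_last)
{
    assert(ring != NULL && n > 0);

    /* Mirrors reason 2 above: a stale non-NULL pointer blocks setup. */
    if (ring->buffer_info)
        return -EEXIST;

    ring->buffer_info = calloc(n, 16);
    if (!ring->buffer_info)
        goto err_buffer_info;

    ring->free_ids = calloc(n, sizeof(unsigned short));
    if (!ring->free_ids)
        goto err_free_ids;

    ring->push_buf = fail_last ? NULL : calloc(n, 1);
    if (!ring->push_buf)
        goto err_push_buf;

    return 0;

    /* Unwind in reverse order; every freed pointer is reset to NULL. */
err_push_buf:
    free(ring->free_ids);
    ring->free_ids = NULL;
err_free_ids:
    free(ring->buffer_info);
    ring->buffer_info = NULL;
err_buffer_info:
    return -ENOMEM;
}
```

      Because the error path leaves every pointer NULL, a later retry of the
      same setup succeeds instead of tripping over a dangling pointer.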
      
      Fixes: 1738cd3e ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
      Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
      Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8ee8ee7f
    • Sameeh Jubran's avatar
      net: ena: fix swapped parameters when calling ena_com_indirect_table_fill_entry · 3c6eeff2
      Sameeh Jubran authored
      The second parameter should be the index of the table rather than the value.
      
      Fixes: 1738cd3e ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
      Signed-off-by: Saeed Bshara <saeedb@amazon.com>
      Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3c6eeff2
    • Haiyang Zhang's avatar
      hv_netvsc: fix race that may miss tx queue wakeup · 93aa4792
      Haiyang Zhang authored
      When the ring buffer is almost full due to RX completion messages, a
      TX packet may reach the "low watermark" and cause the queue to be
      stopped. If the TX completion arrives before the queue is stopped, the
      wakeup may be missed.
      
      This patch moves the check for the last pending packet to cover both
      EAGAIN and success cases, so the queue will be reliably woken up when
      necessary.
      
      Reported-and-tested-by: Stephan Klein <stephan.klein@wegfinder.at>
      Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      93aa4792
  4. 02 May, 2019 6 commits
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · ea986679
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Out of bounds access in xfrm IPSEC policy unlink, from Yue Haibing.
      
       2) Missing length check for esp4 UDP encap, from Sabrina Dubroca.
      
       3) Fix byte order of RX STBC access in mac80211, from Johannes Berg.
      
       4) Infinite loop in bpftool map create, from Alban Crequy.
      
       5) Register mark fix in ebpf verifier after pkt/null checks, from Paul
          Chaignon.
      
       6) Properly use rcu_dereference_sk_user_data in L2TP code, from Eric
          Dumazet.
      
       7) Buffer overrun in marvell phy driver, from Andrew Lunn.
      
       8) Several crash and statistics handling fixes to bnxt_en driver, from
          Michael Chan and Vasundhara Volam.
      
       9) Several fixes to the TLS layer from Jakub Kicinski (copying negative
          amounts of data in reencrypt, reencrypt frag copying, blind nskb->sk
          NULL deref, etc).
      
      10) Several UDP GRO fixes, from Paolo Abeni and Eric Dumazet.
      
      11) PID/UID checks on ipv6 flow labels are inverted, from Willem de
          Bruijn.
      
      12) Use after free in l2tp, from Eric Dumazet.
      
      13) IPV6 route destroy races, also from Eric Dumazet.
      
      14) SCTP state machine can erroneously run recursively, fix from Xin
          Long.
      
      15) Adjust AF_PACKET msg_name length checks, add padding bytes if
          necessary. From Willem de Bruijn.
      
      16) Preserve skb_iif, so that forwarded packets have consistent values
          even if fragmentation is involved. From Shmulik Ladkani.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (69 commits)
        udp: fix GRO packet of death
        ipv6: A few fixes on dereferencing rt->from
        rds: ib: force endiannes annotation
        selftests: fib_rule_tests: print the result and return 1 if any tests failed
        ipv4: ip_do_fragment: Preserve skb_iif during fragmentation
        net/tls: avoid NULL pointer deref on nskb->sk in fallback
        selftests: fib_rule_tests: Fix icmp proto with ipv6
        packet: validate msg_namelen in send directly
        packet: in recvmsg msg_name return at least sizeof sockaddr_ll
        sctp: avoid running the sctp state machine recursively
        stmmac: pci: Fix typo in IOT2000 comment
        Documentation: fix netdev-FAQ.rst markup warning
        ipv6: fix races in ip6_dst_destroy()
        l2ip: fix possible use-after-free
        appletalk: Set error code if register_snap_client failed
        net: dsa: bcm_sf2: fix buffer overflow doing set_rxnfc
        rxrpc: Fix net namespace cleanup
        ipv6/flowlabel: wait rcu grace period before put_pid()
        vrf: Use orig netdev to count Ip6InNoRoutes and a fresh route lookup when sending dest unreach
        tcp: add sanity tests in tcp_add_backlog()
        ...
      ea986679
    • Linus Torvalds's avatar
      Merge tag 'for-linus-20190502' of git://git.kernel.dk/linux-block · 5ce3307b
      Linus Torvalds authored
      Pull io_uring fixes from Jens Axboe:
       "This is mostly io_uring fixes/tweaks. Most of these were actually done
        in time for the last -rc, but I wanted to ensure that everything
        tested out great before including them. The code delta looks larger
        than it really is, as it's mostly just comment additions/changes.
      
        Outside of the comment additions/changes, this is mostly removal of
        unnecessary barriers. In all, this pull request contains:
      
         - Tweak to how we handle errors at submission time. We now post a
           completion event if the error occurs on behalf of an sqe, instead
           of returning it through the system call. If the error happens
           outside of a specific sqe, we return the error through the system
           call. This makes it nicer to use and makes the "normal" use case
           behave the same as the offload cases. (me)
      
         - Fix for a missing req reference drop from async context (me)
      
         - If an sqe is submitted with RWF_NOWAIT, don't punt it to async
           context. Return -EAGAIN directly, instead of using it as a hint to
           do async punt. (Stefan)
      
         - Fix notes on barriers (Stefan)
      
         - Remove unnecessary barriers (Stefan)
      
         - Fix potential double free of memory in setup error (Mark)
      
         - Further improve sq poll CPU validation (Mark)
      
         - Fix page allocation warning and leak on buffer registration error
           (Mark)
      
         - Fix iov_iter_type() for new no-ref flag (Ming)
      
         - Fix a case where dio doesn't honor bio no-page-ref (Ming)"
      
      * tag 'for-linus-20190502' of git://git.kernel.dk/linux-block:
        io_uring: avoid page allocation warnings
        iov_iter: fix iov_iter_type
        block: fix handling for BIO_NO_PAGE_REF
        io_uring: drop req submit reference always in async punt
        io_uring: free allocated io_memory once
        io_uring: fix SQPOLL cpu validation
        io_uring: have submission side sqe errors post a cqe
        io_uring: remove unnecessary barrier after unsetting IORING_SQ_NEED_WAKEUP
        io_uring: remove unnecessary barrier after incrementing dropped counter
        io_uring: remove unnecessary barrier before reading SQ tail
        io_uring: remove unnecessary barrier after updating SQ head
        io_uring: remove unnecessary barrier before reading cq head
        io_uring: remove unnecessary barrier before wq_has_sleeper
        io_uring: fix notes on barriers
        io_uring: fix handling SQEs requesting NOWAIT
      5ce3307b
    • Linus Torvalds's avatar
      Merge tag 'pci-v5.1-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci · b7a5b22b
      Linus Torvalds authored
      Pull PCI fixes from Bjorn Helgaas:
       "I apologize for sending these so late in the cycle. We went back and
        forth about how to deal with the unexpected logging of intentional
        link state changes and finally decided to just config them off by
        default.
      
        PCI fixes:
      
         - Stop ignoring "pci=disable_acs_redir" parameter (Logan Gunthorpe)
      
         - Use shared MSI/MSI-X vector for Link Bandwidth Management (Alex
           Williamson)
      
         - Add Kconfig option for Link Bandwidth notification messages (Keith
           Busch)"
      
      * tag 'pci-v5.1-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
        PCI/LINK: Add Kconfig option (default off)
        PCI/portdrv: Use shared MSI/MSI-X vector for Bandwidth Management
        PCI: Fix issue with "pci=disable_acs_redir" parameter being ignored
      b7a5b22b
    • Linus Torvalds's avatar
      Merge tag 'mtd/fixes-for-5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux · e2a4b102
      Linus Torvalds authored
      Pull MTD fix from Richard Weinberger:
       "A single regression fix for the marvell nand driver"
      
      * tag 'mtd/fixes-for-5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux:
        mtd: rawnand: marvell: Clean the controller state before each operation
      e2a4b102
    • Keith Busch's avatar
      PCI/LINK: Add Kconfig option (default off) · 2078e1e7
      Keith Busch authored
      e8303bb7 ("PCI/LINK: Report degraded links via link bandwidth
      notification") added dmesg logging whenever a link changes speed or width
      to a state that is considered degraded.  Unfortunately, it cannot
      differentiate signal integrity-related link changes from those
      intentionally initiated by an endpoint driver, including drivers that may
      live in userspace or VMs when making use of vfio-pci.  Some GPU drivers
      actively manage the link state to save power, which generates a stream of
      messages like this:
      
        vfio-pci 0000:07:00.0: 32.000 Gb/s available PCIe bandwidth, limited by 2.5 GT/s x16 link at 0000:00:02.0 (capable of 64.000 Gb/s with 5 GT/s x16 link)
      
      Since we can't distinguish the intentional changes from the signal
      integrity issues, leave the reporting turned off by default.  Add a Kconfig
      option to turn it on if desired.
      
      Fixes: e8303bb7 ("PCI/LINK: Report degraded links via link bandwidth notification")
      Link: https://lore.kernel.org/linux-pci/20190501142942.26972-1-keith.busch@intel.com
      Signed-off-by: Keith Busch <keith.busch@intel.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      2078e1e7
    • Eric Dumazet's avatar
      udp: fix GRO packet of death · 4dd2b82d
      Eric Dumazet authored
      syzbot was able to crash the host by sending UDP packets with a 0 payload.
      
      TCP does not have this issue since we do not aggregate packets without
      payload.
      
      Since dev_gro_receive() sets gso_size based on skb_gro_len(skb)
      it seems not worth trying to cope with padded packets.
      
      BUG: KASAN: slab-out-of-bounds in skb_gro_receive+0xf5f/0x10e0 net/core/skbuff.c:3826
      Read of size 16 at addr ffff88808893fff0 by task syz-executor612/7889
      
      CPU: 0 PID: 7889 Comm: syz-executor612 Not tainted 5.1.0-rc7+ #96
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x172/0x1f0 lib/dump_stack.c:113
       print_address_description.cold+0x7c/0x20d mm/kasan/report.c:187
       kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317
       __asan_report_load16_noabort+0x14/0x20 mm/kasan/generic_report.c:133
       skb_gro_receive+0xf5f/0x10e0 net/core/skbuff.c:3826
       udp_gro_receive_segment net/ipv4/udp_offload.c:382 [inline]
       call_gro_receive include/linux/netdevice.h:2349 [inline]
       udp_gro_receive+0xb61/0xfd0 net/ipv4/udp_offload.c:414
       udp4_gro_receive+0x763/0xeb0 net/ipv4/udp_offload.c:478
       inet_gro_receive+0xe72/0x1110 net/ipv4/af_inet.c:1510
       dev_gro_receive+0x1cd0/0x23c0 net/core/dev.c:5581
       napi_gro_frags+0x36b/0xd10 net/core/dev.c:5843
       tun_get_user+0x2f24/0x3fb0 drivers/net/tun.c:1981
       tun_chr_write_iter+0xbd/0x156 drivers/net/tun.c:2027
       call_write_iter include/linux/fs.h:1866 [inline]
       do_iter_readv_writev+0x5e1/0x8e0 fs/read_write.c:681
       do_iter_write fs/read_write.c:957 [inline]
       do_iter_write+0x184/0x610 fs/read_write.c:938
       vfs_writev+0x1b3/0x2f0 fs/read_write.c:1002
       do_writev+0x15e/0x370 fs/read_write.c:1037
       __do_sys_writev fs/read_write.c:1110 [inline]
       __se_sys_writev fs/read_write.c:1107 [inline]
       __x64_sys_writev+0x75/0xb0 fs/read_write.c:1107
       do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x441cc0
      Code: 05 48 3d 01 f0 ff ff 0f 83 9d 09 fc ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 83 3d 51 93 29 00 00 75 14 b8 14 00 00 00 0f 05 <48> 3d 01 f0 ff ff 0f 83 74 09 fc ff c3 48 83 ec 08 e8 ba 2b 00 00
      RSP: 002b:00007ffe8c716118 EFLAGS: 00000246 ORIG_RAX: 0000000000000014
      RAX: ffffffffffffffda RBX: 00007ffe8c716150 RCX: 0000000000441cc0
      RDX: 0000000000000001 RSI: 00007ffe8c716170 RDI: 00000000000000f0
      RBP: 0000000000000000 R08: 000000000000ffff R09: 0000000000a64668
      R10: 0000000020000040 R11: 0000000000000246 R12: 000000000000c2d9
      R13: 0000000000402b50 R14: 0000000000000000 R15: 0000000000000000
      
      Allocated by task 5143:
       save_stack+0x45/0xd0 mm/kasan/common.c:75
       set_track mm/kasan/common.c:87 [inline]
       __kasan_kmalloc mm/kasan/common.c:497 [inline]
       __kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:470
       kasan_slab_alloc+0xf/0x20 mm/kasan/common.c:505
       slab_post_alloc_hook mm/slab.h:437 [inline]
       slab_alloc mm/slab.c:3393 [inline]
       kmem_cache_alloc+0x11a/0x6f0 mm/slab.c:3555
       mm_alloc+0x1d/0xd0 kernel/fork.c:1030
       bprm_mm_init fs/exec.c:363 [inline]
       __do_execve_file.isra.0+0xaa3/0x23f0 fs/exec.c:1791
       do_execveat_common fs/exec.c:1865 [inline]
       do_execve fs/exec.c:1882 [inline]
       __do_sys_execve fs/exec.c:1958 [inline]
       __se_sys_execve fs/exec.c:1953 [inline]
       __x64_sys_execve+0x8f/0xc0 fs/exec.c:1953
       do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Freed by task 5351:
       save_stack+0x45/0xd0 mm/kasan/common.c:75
       set_track mm/kasan/common.c:87 [inline]
       __kasan_slab_free+0x102/0x150 mm/kasan/common.c:459
       kasan_slab_free+0xe/0x10 mm/kasan/common.c:467
       __cache_free mm/slab.c:3499 [inline]
       kmem_cache_free+0x86/0x260 mm/slab.c:3765
       __mmdrop+0x238/0x320 kernel/fork.c:677
       mmdrop include/linux/sched/mm.h:49 [inline]
       finish_task_switch+0x47b/0x780 kernel/sched/core.c:2746
       context_switch kernel/sched/core.c:2880 [inline]
       __schedule+0x81b/0x1cc0 kernel/sched/core.c:3518
       preempt_schedule_irq+0xb5/0x140 kernel/sched/core.c:3745
       retint_kernel+0x1b/0x2d
       arch_local_irq_restore arch/x86/include/asm/paravirt.h:767 [inline]
       kmem_cache_free+0xab/0x260 mm/slab.c:3766
       anon_vma_chain_free mm/rmap.c:134 [inline]
       unlink_anon_vmas+0x2ba/0x870 mm/rmap.c:401
       free_pgtables+0x1af/0x2f0 mm/memory.c:394
       exit_mmap+0x2d1/0x530 mm/mmap.c:3144
       __mmput kernel/fork.c:1046 [inline]
       mmput+0x15f/0x4c0 kernel/fork.c:1067
       exec_mmap fs/exec.c:1046 [inline]
       flush_old_exec+0x8d9/0x1c20 fs/exec.c:1279
       load_elf_binary+0x9bc/0x53f0 fs/binfmt_elf.c:864
       search_binary_handler fs/exec.c:1656 [inline]
       search_binary_handler+0x17f/0x570 fs/exec.c:1634
       exec_binprm fs/exec.c:1698 [inline]
       __do_execve_file.isra.0+0x1394/0x23f0 fs/exec.c:1818
       do_execveat_common fs/exec.c:1865 [inline]
       do_execve fs/exec.c:1882 [inline]
       __do_sys_execve fs/exec.c:1958 [inline]
       __se_sys_execve fs/exec.c:1953 [inline]
       __x64_sys_execve+0x8f/0xc0 fs/exec.c:1953
       do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      The buggy address belongs to the object at ffff88808893f7c0
       which belongs to the cache mm_struct of size 1496
      The buggy address is located 600 bytes to the right of
       1496-byte region [ffff88808893f7c0, ffff88808893fd98)
      The buggy address belongs to the page:
      page:ffffea0002224f80 count:1 mapcount:0 mapping:ffff88821bc40ac0 index:0xffff88808893f7c0 compound_mapcount: 0
      flags: 0x1fffc0000010200(slab|head)
      raw: 01fffc0000010200 ffffea00025b4f08 ffffea00027b9d08 ffff88821bc40ac0
      raw: ffff88808893f7c0 ffff88808893e440 0000000100000001 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
       ffff88808893fe80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
       ffff88808893ff00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      >ffff88808893ff80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
                                                                   ^
       ffff888088940000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
       ffff888088940080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      
      Fixes: e20cf8d3 ("udp: implement GRO for plain UDP sockets.")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Paolo Abeni <pabeni@redhat.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4dd2b82d
  5. 01 May, 2019 5 commits
    • Linus Torvalds's avatar
      Merge tag 'for-v5.1-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply · 600d7258
      Linus Torvalds authored
      Pull power supply fixes from Sebastian Reichel:
       "Two more fixes for the 5.1 cycle.
      
        One division by zero fix in a specific driver and one core workaround
        for bad userspace behaviour from systemd regarding uevents. IMHO this
        can be considered to be a userspace bug, but the debug messages are
        useless anyways
      
         - cpcap-battery: fix a division by zero
      
         - core: fix systemd issue due to log messages produced by uevent"
      
      * tag 'for-v5.1-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply:
        power: supply: sysfs: prevent endless uevent loop with CONFIG_POWER_SUPPLY_DEBUG
        power: supply: cpcap-battery: Fix division by zero
      600d7258
    • Wang YanQing's avatar
      bpf, x32: Fix bug for BPF_ALU64 | BPF_NEG · b9aa0b35
      Wang YanQing authored
      The current implementation has two errors:
      
      1: The second xor instruction clears the carry flag, which
         is needed by the following sbb instruction.
      2: The encoding selected for the sbb instruction is wrong: it
         is "sbb dreg_hi,ecx", but what we need is "sbb ecx,dreg_hi".
      
      This patch rewrites the implementation and fixes the errors.
      
      This patch fixes the following errors reported by bpf/test_verifier on
      the x32 platform when the JIT is enabled:
      
      "
      0: (b4) w1 = 4
      1: (b4) w2 = 4
      2: (1f) r2 -= r1
      3: (4f) r2 |= r1
      4: (87) r2 = -r2
      5: (c7) r2 s>>= 63
      6: (5f) r1 &= r2
      7: (bf) r0 = r1
      8: (95) exit
      processed 9 insns (limit 131072), stack depth 0
      0: (b4) w1 = 4
      1: (b4) w2 = 4
      2: (1f) r2 -= r1
      3: (4f) r2 |= r1
      4: (87) r2 = -r2
      5: (c7) r2 s>>= 63
      6: (5f) r1 &= r2
      7: (bf) r0 = r1
      8: (95) exit
      processed 9 insns (limit 131072), stack depth 0
      ......
      Summary: 1189 PASSED, 125 SKIPPED, 15 FAILED
      "
      Signed-off-by: Wang YanQing <udknight@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      b9aa0b35
    • Wang YanQing's avatar
      bpf, x32: Fix bug for BPF_JMP | {BPF_JSGT, BPF_JSLE, BPF_JSLT, BPF_JSGE} · 711aef1b
      Wang YanQing authored
      The current method to compare 64-bit numbers for conditional jump is:
      
      1) Compare the high 32 bits first.
      
      2) If the high 32 bits aren't the same, then goto step 4.
      
      3) Compare the low 32 bits.
      
      4) Check the desired condition.
      
      This method is right for unsigned comparison, but it is buggy for signed
      comparison, because it does a signed comparison for the low 32 bits too.
      
      A 64-bit number has only one sign bit, its MSB, so it is wrong to
      treat the low 32 bits as a signed number and do a signed comparison
      on them.
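
      The corrected rule can be sketched in C for the "signed greater than"
      case. This is an illustration of the comparison logic only, not the
      JIT code; toy_jsgt64 and toy_jsgt64_checked are hypothetical names,
      and the int32_t casts assume the usual two's complement conversion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Signed 64-bit "a > b" built from 32-bit halves: the high words are
 * compared as signed values (they carry the sign bit), while the low
 * words are compared as unsigned values.
 */
static bool toy_jsgt64(uint32_t a_hi, uint32_t a_lo,
                       uint32_t b_hi, uint32_t b_lo)
{
    if (a_hi != b_hi)
        return (int32_t)a_hi > (int32_t)b_hi;
    return a_lo > b_lo;   /* unsigned: the low word has no sign bit */
}

/* Convenience wrapper that cross-checks against the native comparison. */
static bool toy_jsgt64_checked(int64_t a, int64_t b)
{
    uint64_t ua = (uint64_t)a, ub = (uint64_t)b;
    bool r = toy_jsgt64((uint32_t)(ua >> 32), (uint32_t)ua,
                        (uint32_t)(ub >> 32), (uint32_t)ub);

    assert(r == (a > b));
    return r;
}
```

      A case that separates the two approaches is a = 0x80000000 and b = 1:
      both high words are 0, and a signed comparison of the low words would
      treat 0x80000000 as negative and wrongly report a <= b.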
      
      This patch fixes the bug and adds a testcase in selftests/bpf for such bug.
      Signed-off-by: Wang YanQing <udknight@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      711aef1b
    • Martin KaFai Lau's avatar
      ipv6: A few fixes on dereferencing rt->from · 886b7a50
      Martin KaFai Lau authored
      It is a followup after the fix in
      commit 9c69a132 ("route: Avoid crash from dereferencing NULL rt->from")
      
      rt6_do_redirect():
      1. NULL checking is needed on rt->from because a parallel
         fib6_info delete could happen that sets rt->from to NULL.
         (e.g. rt6_remove_exception() and fib6_drop_pcpu_from()).
      
      2. fib6_info_hold() is not enough.  Same reason as (1).
         Meaning, holding dst->__refcnt cannot ensure
         rt->from is not NULL or rt->from->fib6_ref is not 0.
      
         Instead of using fib6_info_hold_safe() which ip6_rt_cache_alloc()
         is already doing, this patch chooses to extend the rcu section
         to keep "from" dereference-able after checking for NULL.
      
      inet6_rtm_getroute():
      1. NULL checking is also needed on rt->from for a similar reason.
         Note that inet6_rtm_getroute() is using RTNL_FLAG_DOIT_UNLOCKED.
      
      Fixes: a68886a6 ("net/ipv6: Make from in rt6_info rcu protected")
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Acked-by: Wei Wang <weiwan@google.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      886b7a50
    • Nicholas Mc Guire's avatar
      rds: ib: force endiannes annotation · f3505745
      Nicholas Mc Guire authored
      While the endianness is being handled correctly, as indicated by the comment
      above the offending line, sparse was unhappy with the missing annotation,
      as be64_to_cpu() expects a __be64 argument. To mitigate this, all
      involved variables are changed to a consistent __le64 annotation and the
      conversion to uint64_t is delayed to the call to rds_cong_map_updated().
      
      Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
      Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f3505745