1. 31 Mar, 2018 36 commits
    • dpaa_eth: fix error in dpaa_remove() · 5bbb99d2
      Madalin Bucur authored
      
      [ Upstream commit 88075256 ]
      
      The recent changes that make the driver probing compatible with DSA
      were not propagated to the dpaa_remove() function, breaking the
      module unload function. Use the proper device to address the issue.
      Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • soc/fsl/qbman: fix issue in qman_delete_cgr_safe() · 29cd9c2d
      Madalin Bucur authored
      
      [ Upstream commit 96f413f4 ]
      
      The wait_for_completion() call in qman_delete_cgr_safe()
      was triggering a "scheduling while atomic" bug. Replace the
      kthread with an smp_call_function_single() call to fix it.
      Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
      Signed-off-by: Roy Pledge <roy.pledge@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • team: Fix double free in error path · 43d8f3c5
      Arkadi Sharshevsky authored
      
      [ Upstream commit cbcc607e ]
      
      The __send_and_alloc_skb() receives a skb ptr as a parameter but in
      case it fails the skb is not valid:
      - Send failed and released the skb internally.
      - Allocation failed.
      
      The current code tries to release the skb in case of failure which
      causes redundant freeing.
      
      Fixes: 9b00cf2d ("team: implement multipart netlink messages for options transfers")
      Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • skbuff: Fix not waking applications when errors are enqueued · 329f4710
      Vinicius Costa Gomes authored
      
      [ Upstream commit 6e5d58fd ]
      
      When errors are enqueued to the error queue via sock_queue_err_skb()
      function, it is possible that the waiting application is not notified.
      
      Calling 'sk->sk_data_ready()' would not notify applications that
      selected only POLLERR events in poll() (for example).
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Reported-by: Randy E. Witt <randy.e.witt@intel.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • qede: Fix qedr link update · e90e9771
      Michal Kalderon authored
      
      [ Upstream commit 4609adc2 ]
      
      Link updates were not reported to qedr correctly,
      leading to cases where a link could be down but qedr
      would see it as up.
      In addition, once qede was loaded, the link state would be
      reported as up regardless of the actual link state.
      Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com>
      Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: systemport: Rewrite __bcm_sysport_tx_reclaim() · c6841b47
      Florian Fainelli authored
      
      [ Upstream commit 484d802d ]
      
      There is no need for complex checking between the last consumed index
      and current consumed index, a simple subtraction will do.
      
      This also eliminates the possibility of a permanent transmit queue stall
      under the following conditions:
      
      - one CPU bursts ring->size worth of traffic (up to 256 buffers), to the
        point where we run out of free descriptors, so we stop the transmit
        queue at the end of bcm_sysport_xmit()
      
      - because of our locking, we have the transmit process disable
        interrupts which means we can be blocking the TX reclamation process
      
      - when TX reclamation finally runs, we will be computing the difference
        between ring->c_index (last consumed index by SW) and what the HW
        reports through its register
      
      - this register is masked with (ring->size - 1) = 0xff, which will lead
        to stripping the upper bits of the index (register is 16-bits wide)
      
      - we will be computing last_tx_cn as 0, which means there is no work to
        be done, and we never wake-up the transmit queue, leaving it
        permanently disabled
      
      A practical example is e.g: ring->c_index aka last_c_index = 12, we
      pushed 256 entries, HW consumer index = 268, we mask it with 0xff = 12,
      so last_tx_cn == 0, nothing happens.
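
      The arithmetic in this example can be modelled in a few lines of
      userspace Python. This is only a sketch of the index math, not the
      driver code: the function names are hypothetical, and the 16-bit
      register width and ring size of 256 are taken from the description
      above.

```python
RING_SIZE = 256      # descriptors per ring, as in the example above
INDEX_MASK = 0xFFFF  # the hardware consumer index register is 16 bits wide


def reclaimed_masked_first(last_c_index, hw_c_index, ring_size=RING_SIZE):
    """Old scheme (hypothetical model): mask to the ring size, then compare."""
    c_index = hw_c_index & (ring_size - 1)
    if c_index >= last_c_index:
        return c_index - last_c_index
    return ring_size - last_c_index + c_index


def reclaimed_by_subtraction(last_c_index, hw_c_index):
    """New scheme (hypothetical model): subtract, then wrap to 16 bits."""
    return (hw_c_index - last_c_index) & INDEX_MASK


if __name__ == "__main__":
    # last_c_index = 12, 256 entries pushed, hardware index = 268
    print(reclaimed_masked_first(12, 268))    # 0: the full ring looks empty
    print(reclaimed_by_subtraction(12, 268))  # 256: every buffer is reclaimed
```

      The subtraction also survives the 16-bit wrap of the hardware index:
      going from 0xFFF0 to 0x0010 yields 32 completed buffers.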
      
      Fixes: 80105bef ("net: systemport: add Broadcom SYSTEMPORT Ethernet MAC driver")
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: Only honor ifindex in IP_PKTINFO if non-0 · 474aa514
      David Ahern authored
      
      [ Upstream commit 2cbb4ea7 ]
      
      Only allow ifindex from IP_PKTINFO to override SO_BINDTODEVICE settings
      if the index is actually set in the message.
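
      The selection rule can be sketched as a small userspace model. The
      function and parameter names are hypothetical; only the rule itself
      (a zero ifindex in the control message does not override the bound
      device) comes from the description above.

```python
def effective_oif(bound_ifindex, pktinfo_ifindex):
    """Pick the output interface index for a send (illustrative model).

    bound_ifindex:   index set through SO_BINDTODEVICE (0 means unbound).
    pktinfo_ifindex: ifindex carried in the IP_PKTINFO control message
                     (0 means the sender left it unset).
    """
    if pktinfo_ifindex != 0:
        # An explicit index in the message overrides the socket binding.
        return pktinfo_ifindex
    # Otherwise the SO_BINDTODEVICE setting is kept.
    return bound_ifindex
```

      With this rule, a socket bound to interface 3 that sends a message
      with an unset ifindex keeps using interface 3.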
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • netlink: avoid a double skb free in genlmsg_mcast() · 06d3f43d
      Nicolas Dichtel authored
      
      [ Upstream commit 02a2385f ]
      
      nlmsg_multicast() always consumes the skb, so the original skb must be
      freed only when this function is called with a clone.
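
      The ownership rule can be illustrated with a simplified userspace
      model. Everything here is hypothetical scaffolding: the Skb class,
      multicast() and mcast_all_namespaces() are stand-ins, and error
      handling is reduced to a single failure code. The one property taken
      from the text above is that the multicast call consumes whatever
      buffer it is given, whether it succeeds or fails.

```python
class Skb:
    """Toy buffer that detects a double free."""

    def __init__(self):
        self.freed = False

    def clone(self):
        return Skb()

    def free(self):
        if self.freed:
            raise RuntimeError("double free")
        self.freed = True


def multicast(skb, ok):
    """Stand-in for a send that always consumes skb; returns 0 or -1."""
    skb.free()
    return 0 if ok else -1


def mcast_all_namespaces(skb, outcomes):
    """Send a clone to every namespace but the last, which gets the original.

    outcomes: non-empty list of booleans, one simulated result per namespace.
    """
    for ok in outcomes[:-1]:
        err = multicast(skb.clone(), ok)
        if err:
            # Only the clone was consumed, so the original must be freed here.
            skb.free()
            return err
    # The original is consumed by this call even on failure: do not free it.
    return multicast(skb, outcomes[-1])
```

      Freeing the original after the final call fails would raise the
      "double free" error in this model, which is the situation the
      commit avoids.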
      
      Fixes: cb9f7a9a ("netlink: ensure to loop over all netns in genlmsg_multicast_allns()")
      Reported-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
      Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net/iucv: Free memory obtained by kzalloc · 2980f37b
      Arvind Yadav authored
      
      [ Upstream commit fa6a91e9 ]
      
      Free memory by calling put_device(), if afiucv_iucv_init is not
      successful.
      Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
      Reviewed-by: Cornelia Huck <cohuck@redhat.com>
      Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
      Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: fec: Fix unbalanced PM runtime calls · a14b791d
      Florian Fainelli authored
      
      [ Upstream commit a069215c ]
      
      When unbinding/removing the driver, we will run into the following warnings:
      
      [  259.655198] fec 400d1000.ethernet: 400d1000.ethernet supply phy not found, using dummy regulator
      [  259.665065] fec 400d1000.ethernet: Unbalanced pm_runtime_enable!
      [  259.672770] fec 400d1000.ethernet (unnamed net_device) (uninitialized): Invalid MAC address: 00:00:00:00:00:00
      [  259.683062] fec 400d1000.ethernet (unnamed net_device) (uninitialized): Using random MAC address: f2:3e:93:b7:29:c1
      [  259.696239] libphy: fec_enet_mii_bus: probed
      
      Avoid these warnings by balancing the runtime PM calls during fec_drv_remove().
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: ethernet: ti: cpsw: add check for in-band mode setting with RGMII PHY interface · 9cdb0f25
      SZ Lin (林上智) authored
      
      [ Upstream commit f9db5069 ]
      
      According to AM335x TRM[1] 14.3.6.2, AM437x TRM[2] 15.3.6.2 and
      DRA7 TRM[3] 24.11.4.8.7.3.3, in-band mode in EXT_EN(bit18) register is only
      available when PHY is configured in RGMII mode with 10Mbps speed. It will
      cause some networking issues without RGMII mode, such as carrier sense
      errors and low throughput. TI also mentioned this issue in their forum[4].
      
      This patch adds a check on the PHY interface type, so that in-band mode
      is set only when the PHY is in RGMII mode with 10Mbps speed.
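
      The condition can be sketched as a userspace model. The function
      name, the string spellings of the interface modes, and the choice to
      clear the bit otherwise are assumptions made for illustration; the
      bit position (18) and the "RGMII at 10Mbps" rule come from the text
      above.

```python
EXT_EN = 1 << 18  # in-band mode bit (bit 18), as named in the commit message

# Assumed spellings of the RGMII interface variants.
RGMII_MODES = {"rgmii", "rgmii-id", "rgmii-rxid", "rgmii-txid"}


def apply_in_band(mac_control, speed_mbps, phy_interface):
    """Return mac_control with EXT_EN set only for RGMII at 10 Mbps."""
    if speed_mbps == 10 and phy_interface in RGMII_MODES:
        return mac_control | EXT_EN
    # Any other speed or interface type: leave in-band mode off.
    return mac_control & ~EXT_EN
```

      For example, a 10 Mbps link over "mii" or a 100 Mbps link over
      "rgmii-id" leaves the bit clear in this model.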
      
      References:
      [1]: https://www.ti.com/lit/ug/spruh73p/spruh73p.pdf
      [2]: http://www.ti.com/lit/ug/spruhl7h/spruhl7h.pdf
      [3]: http://www.ti.com/lit/ug/spruic2b/spruic2b.pdf
      [4]: https://e2e.ti.com/support/arm/sitara_arm/f/791/p/640765/2392155

      Suggested-by: Holsety Chen (陳憲輝) <Holsety.Chen@moxa.com>
      Signed-off-by: SZ Lin (林上智) <sz.lin@moxa.com>
      Signed-off-by: Schuyler Patton <spatton@ti.com>
      Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: ethernet: arc: Fix a potential memory leak if an optional regulator is deferred · 89142a0e
      Christophe JAILLET authored
      
      [ Upstream commit 00777fac ]
      
      If the optional regulator is deferred, we must release some resources.
      They will be re-allocated when the probe function is called again.
      
      Fixes: 6eacf311 ("ethernet: arc: Add support for Rockchip SoC layer device tree bindings")
      Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • l2tp: do not accept arbitrary sockets · 2d5b0ed0
      Eric Dumazet authored
      
      [ Upstream commit 17cfe79a ]
      
      syzkaller found an issue caused by lack of sufficient checks
      in l2tp_tunnel_create()
      
      RAW sockets can not be considered as UDP ones for instance.
      
      In another patch, we shall replace all pr_err() by less intrusive
      pr_debug() so that syzkaller can find other bugs faster.
      Acked-by: Guillaume Nault <g.nault@alphalink.fr>
      Acked-by: James Chapman <jchapman@katalix.com>
      
      ==================================================================
      BUG: KASAN: slab-out-of-bounds in setup_udp_tunnel_sock+0x3ee/0x5f0 net/ipv4/udp_tunnel.c:69
      dst_release: dst:00000000d53d0d0f refcnt:-1
      Write of size 1 at addr ffff8801d013b798 by task syz-executor3/6242
      
      CPU: 1 PID: 6242 Comm: syz-executor3 Not tainted 4.16.0-rc2+ #253
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:17 [inline]
       dump_stack+0x194/0x24d lib/dump_stack.c:53
       print_address_description+0x73/0x250 mm/kasan/report.c:256
       kasan_report_error mm/kasan/report.c:354 [inline]
       kasan_report+0x23b/0x360 mm/kasan/report.c:412
       __asan_report_store1_noabort+0x17/0x20 mm/kasan/report.c:435
       setup_udp_tunnel_sock+0x3ee/0x5f0 net/ipv4/udp_tunnel.c:69
       l2tp_tunnel_create+0x1354/0x17f0 net/l2tp/l2tp_core.c:1596
       pppol2tp_connect+0x14b1/0x1dd0 net/l2tp/l2tp_ppp.c:707
       SYSC_connect+0x213/0x4a0 net/socket.c:1640
       SyS_connect+0x24/0x30 net/socket.c:1621
       do_syscall_64+0x280/0x940 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x42/0xb7
      
      Fixes: fd558d18 ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ipv6: fix access to non-linear packet in ndisc_fill_redirect_hdr_option() · 18c64745
      Lorenzo Bianconi authored
      
      [ Upstream commit 9f62c15f ]
      
      Fix the following slab-out-of-bounds kasan report in
      ndisc_fill_redirect_hdr_option when the incoming ipv6 packet is not
      linear and the accessed data are not in the linear data region of orig_skb.
      
      [ 1503.122508] ==================================================================
      [ 1503.122832] BUG: KASAN: slab-out-of-bounds in ndisc_send_redirect+0x94e/0x990
      [ 1503.123036] Read of size 1184 at addr ffff8800298ab6b0 by task netperf/1932
      
      [ 1503.123220] CPU: 0 PID: 1932 Comm: netperf Not tainted 4.16.0-rc2+ #124
      [ 1503.123347] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-2.fc27 04/01/2014
      [ 1503.123527] Call Trace:
      [ 1503.123579]  <IRQ>
      [ 1503.123638]  print_address_description+0x6e/0x280
      [ 1503.123849]  kasan_report+0x233/0x350
      [ 1503.123946]  memcpy+0x1f/0x50
      [ 1503.124037]  ndisc_send_redirect+0x94e/0x990
      [ 1503.125150]  ip6_forward+0x1242/0x13b0
      [...]
      [ 1503.153890] Allocated by task 1932:
      [ 1503.153982]  kasan_kmalloc+0x9f/0xd0
      [ 1503.154074]  __kmalloc_track_caller+0xb5/0x160
      [ 1503.154198]  __kmalloc_reserve.isra.41+0x24/0x70
      [ 1503.154324]  __alloc_skb+0x130/0x3e0
      [ 1503.154415]  sctp_packet_transmit+0x21a/0x1810
      [ 1503.154533]  sctp_outq_flush+0xc14/0x1db0
      [ 1503.154624]  sctp_do_sm+0x34e/0x2740
      [ 1503.154715]  sctp_primitive_SEND+0x57/0x70
      [ 1503.154807]  sctp_sendmsg+0xaa6/0x1b10
      [ 1503.154897]  sock_sendmsg+0x68/0x80
      [ 1503.154987]  ___sys_sendmsg+0x431/0x4b0
      [ 1503.155078]  __sys_sendmsg+0xa4/0x130
      [ 1503.155168]  do_syscall_64+0x171/0x3f0
      [ 1503.155259]  entry_SYSCALL_64_after_hwframe+0x42/0xb7
      
      [ 1503.155436] Freed by task 1932:
      [ 1503.155527]  __kasan_slab_free+0x134/0x180
      [ 1503.155618]  kfree+0xbc/0x180
      [ 1503.155709]  skb_release_data+0x27f/0x2c0
      [ 1503.155800]  consume_skb+0x94/0xe0
      [ 1503.155889]  sctp_chunk_put+0x1aa/0x1f0
      [ 1503.155979]  sctp_inq_pop+0x2f8/0x6e0
      [ 1503.156070]  sctp_assoc_bh_rcv+0x6a/0x230
      [ 1503.156164]  sctp_inq_push+0x117/0x150
      [ 1503.156255]  sctp_backlog_rcv+0xdf/0x4a0
      [ 1503.156346]  __release_sock+0x142/0x250
      [ 1503.156436]  release_sock+0x80/0x180
      [ 1503.156526]  sctp_sendmsg+0xbb0/0x1b10
      [ 1503.156617]  sock_sendmsg+0x68/0x80
      [ 1503.156708]  ___sys_sendmsg+0x431/0x4b0
      [ 1503.156799]  __sys_sendmsg+0xa4/0x130
      [ 1503.156889]  do_syscall_64+0x171/0x3f0
      [ 1503.156980]  entry_SYSCALL_64_after_hwframe+0x42/0xb7
      
      [ 1503.157158] The buggy address belongs to the object at ffff8800298ab600
                      which belongs to the cache kmalloc-1024 of size 1024
      [ 1503.157444] The buggy address is located 176 bytes inside of
                      1024-byte region [ffff8800298ab600, ffff8800298aba00)
      [ 1503.157702] The buggy address belongs to the page:
      [ 1503.157820] page:ffffea0000a62a00 count:1 mapcount:0 mapping:0000000000000000 index:0x0 compound_mapcount: 0
      [ 1503.158053] flags: 0x4000000000008100(slab|head)
      [ 1503.158171] raw: 4000000000008100 0000000000000000 0000000000000000 00000001800e000e
      [ 1503.158350] raw: dead000000000100 dead000000000200 ffff880036002600 0000000000000000
      [ 1503.158523] page dumped because: kasan: bad access detected
      
      [ 1503.158698] Memory state around the buggy address:
      [ 1503.158816]  ffff8800298ab900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      [ 1503.158988]  ffff8800298ab980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      [ 1503.159165] >ffff8800298aba00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [ 1503.159338]                    ^
      [ 1503.159436]  ffff8800298aba80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [ 1503.159610]  ffff8800298abb00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [ 1503.159785] ==================================================================
      [ 1503.159964] Disabling lock debugging due to kernel taint
      
      The test scenario to trigger the issue consists of 4 devices:
      - H0: data sender, connected to LAN0
      - H1: data receiver, connected to LAN1
      - GW0 and GW1: routers between LAN0 and LAN1. Both of them have an
        ethernet connection on LAN0 and LAN1
      On H{0,1} set GW0 as default gateway while on GW0 set GW1 as next hop for
      data from LAN0 to LAN1.
      Moreover create an ip6ip6 tunnel between H0 and H1 and send 3 concurrent
      data streams (TCP/UDP/SCTP) from H0 to H1 through ip6ip6 tunnel (send
      buffer size is set to 16K). While data streams are active flush the route
      cache on HA multiple times.
      I have not been able to identify a given commit that introduced the issue
      since, using the reproducer described above, the kasan report has been
      triggered from 4.14 and I have not gone back further.
      Reported-by: Jianlin Shi <jishi@redhat.com>
      Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • dccp: check sk for closed state in dccp_sendmsg() · 91d27e0c
      Alexey Kodanev authored
      
      [ Upstream commit 67f93df7 ]
      
      dccp_disconnect() sets 'dp->dccps_hc_tx_ccid' tx handler to NULL,
      therefore if DCCP socket is disconnected and dccp_sendmsg() is
      called after it, it will cause a NULL pointer dereference in
      dccp_write_xmit().
      
      This crash and the reproducer were reported by syzbot. It looks like
      it is reproduced if commit 69c64866 ("dccp: CVE-2017-8824:
      use-after-free in DCCP code") is applied.
      
      Reported-by: syzbot+f99ab3887ab65d70f816@syzkaller.appspotmail.com
      Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: Fix hlist corruptions in inet_evict_bucket() · 946b9671
      Kirill Tkhai authored
      
      [ Upstream commit a5600024 ]
      
      inet_evict_bucket() iterates a global list, and
      several tasks may call it in parallel. All of
      them hash the same fq->list_evictor to different
      lists, which leads to list corruption.

      This patch hashes fq to the expired list only if
      another task has not already done so. Since
      inet_frag_alloc() allocates fq using
      kmem_cache_zalloc(), we may rely on list_evictor
      being initially unhashed.

      The problem seems to predate async
      pernet_operations, as it was already possible for
      the exit method to run in parallel with
      inet_frags::frags_work, so I add two Fixes tags.
      This may also go to stable.
      
      Fixes: d1fe1944 "inet: frag: don't re-use chainlist for evictor"
      Fixes: f84c6821 "net: Convert pernet_subsys, registered from inet_init()"
      Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: use skb_to_full_sk() in skb_update_prio() · 4ff5078b
      Eric Dumazet authored
      
      [ Upstream commit 4dcb31d4 ]
      
      Andrei Vagin reported a KASAN: slab-out-of-bounds error in
      skb_update_prio()
      
      Since SYNACK might be attached to a request socket, we need to
      get back to the listener socket.
      Since this listener is manipulated without locks, add const
      qualifiers to sock_cgroup_prioidx() so that the const can also
      be used in skb_update_prio()
      
      Also add the const qualifier to sock_cgroup_classid() for consistency.
      
      Fixes: ca6fb065 ("tcp: attach SYNACK messages to request sockets instead of listener")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Andrei Vagin <avagin@virtuozzo.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ieee802154: 6lowpan: fix possible NULL deref in lowpan_device_event() · f6cdb675
      Eric Dumazet authored
      
      [ Upstream commit ca0edb13 ]
      
      A tun device type can trivially be set to arbitrary value using
      TUNSETLINK ioctl().
      
      Therefore, lowpan_device_event() must really check that ieee802154_ptr
      is not NULL.
      
      Fixes: 2c88b528 ("ieee802154: 6lowpan: remove check on null")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Alexander Aring <alex.aring@gmail.com>
      Cc: Stefan Schmidt <stefan@osg.samsung.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Acked-by: Stefan Schmidt <stefan@osg.samsung.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • sch_netem: fix skb leak in netem_enqueue() · f77ff13a
      Alexey Kodanev authored
      
      [ Upstream commit 35d889d1 ]
      
      When we exceed current packets limit and we have more than one
      segment in the list returned by skb_gso_segment(), netem drops
      only the first one, skipping the rest, hence kmemleak reports:
      
      unreferenced object 0xffff880b5d23b600 (size 1024):
        comm "softirq", pid 0, jiffies 4384527763 (age 2770.629s)
        hex dump (first 32 bytes):
          00 80 23 5d 0b 88 ff ff 00 00 00 00 00 00 00 00  ..#]............
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        backtrace:
          [<00000000d8a19b9d>] __alloc_skb+0xc9/0x520
          [<000000001709b32f>] skb_segment+0x8c8/0x3710
          [<00000000c7b9bb88>] tcp_gso_segment+0x331/0x1830
          [<00000000c921cba1>] inet_gso_segment+0x476/0x1370
          [<000000008b762dd4>] skb_mac_gso_segment+0x1f9/0x510
          [<000000002182660a>] __skb_gso_segment+0x1dd/0x620
          [<00000000412651b9>] netem_enqueue+0x1536/0x2590 [sch_netem]
          [<0000000005d3b2a9>] __dev_queue_xmit+0x1167/0x2120
          [<00000000fc5f7327>] ip_finish_output2+0x998/0xf00
          [<00000000d309e9d3>] ip_output+0x1aa/0x2c0
          [<000000007ecbd3a4>] tcp_transmit_skb+0x18db/0x3670
          [<0000000042d2a45f>] tcp_write_xmit+0x4d4/0x58c0
          [<0000000056a44199>] tcp_tasklet_func+0x3d9/0x540
          [<0000000013d06d02>] tasklet_action+0x1ca/0x250
          [<00000000fcde0b8b>] __do_softirq+0x1b4/0x5a3
          [<00000000e7ed027c>] irq_exit+0x1e2/0x210
      
      Fix it by adding the rest of the segments, if any, to skb 'to_free'
      list. Add new __qdisc_drop_all() and qdisc_drop_all() functions
      because they can be useful in the future if we need to drop segmented
      GSO packets in other places.
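
      The difference between dropping one segment and dropping the whole
      list can be modelled with a toy singly linked list. This is a
      userspace sketch only: the Skb class and both helper names are
      hypothetical, and the model walks to the tail of the segment list,
      which is an illustrative simplification rather than a description of
      how the kernel helpers locate it.

```python
class Skb:
    """Toy packet: a name plus a 'next' link, as in a segment list."""

    def __init__(self, name):
        self.name = name
        self.next = None


def chain(names):
    """Build a linked list of Skb objects from a list of names."""
    head = None
    for name in reversed(names):
        skb = Skb(name)
        skb.next = head
        head = skb
    return head


def names_of(head):
    """Collect the names reachable from head."""
    out = []
    while head is not None:
        out.append(head.name)
        head = head.next
    return out


def drop_first_only(skb, to_free):
    """Queue one skb for freeing; its 'next' link is overwritten."""
    skb.next = to_free  # remaining segments become unreachable (leaked)
    return skb          # new head of the to_free list


def drop_all(skb, to_free):
    """Queue skb and every segment linked after it for freeing."""
    tail = skb
    while tail.next is not None:
        tail = tail.next
    tail.next = to_free
    return skb          # new head of the to_free list
```

      Dropping the head of a three-segment list with drop_first_only()
      leaves the other two segments off the free list, matching the leak
      kmemleak reported; drop_all() queues all three.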
      
      Fixes: 6071bd1a ("netem: Segment GSO packets on enqueue")
      Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • kcm: lock lower socket in kcm_attach · 515bc341
      Tom Herbert authored
      
      [ Upstream commit 2cc683e8 ]
      
      Need to lock lower socket in order to provide mutual exclusion
      with kcm_unattach.
      
      v2: Add Reported-by for syzbot
      
      Fixes: ab7ac4eb ("kcm: Kernel Connection Multiplexor module")
      Reported-by: syzbot+ea75c0ffcd353d32515f064aaebefc5279e6161e@syzkaller.appspotmail.com
      Signed-off-by: Tom Herbert <tom@quantonium.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • rhashtable: Fix rhlist duplicates insertion · 07cf9d30
      Paul Blakey authored
      
      [ Upstream commit d3dcf8eb ]
      
      When inserting duplicate objects (those with the same key), the
      current rhlist implementation corrupts the chain pointers by
      updating the bucket pointer instead of the previous element's next
      pointer to point to the newly inserted node. This causes missing
      elements on removal and traversal.
      
      Fix that by properly updating pprev pointer to point to
      the correct rhash_head next pointer.
      
      Issue: 1241076
      Change-Id: I86b2c140bcb4aeb10b70a72a267ff590bb2b17e7
      Fixes: ca26893f ('rhashtable: Add rhlist interface')
      Signed-off-by: Paul Blakey <paulb@mellanox.com>
      Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ppp: avoid loop in xmit recursion detection code · 090da7ce
      Guillaume Nault authored
      
      [ Upstream commit 6d066734 ]
      
      We already detect situations where a PPP channel sends packets back to
      its upper PPP device. While this is enough to avoid deadlocking on xmit
      locks, this doesn't prevent packets from looping between the channel
      and the unit.
      
      The problem is that ppp_start_xmit() enqueues packets in ppp->file.xq
      before checking for xmit recursion. Therefore, __ppp_xmit_process()
      might dequeue a packet from ppp->file.xq and send it on the channel
      which, in turn, loops it back on the unit. Then ppp_start_xmit()
      queues the packet back to ppp->file.xq and __ppp_xmit_process() picks
      it up and sends it again through the channel. Therefore, the packet
      will loop between __ppp_xmit_process() and ppp_start_xmit() until some
      other part of the xmit path drops it.
      
      For L2TP, we rapidly fill the skb's headroom and pppol2tp_xmit() drops
      the packet after a few iterations. But PPTP reallocates the headroom
      if necessary, letting the loop run and exhaust the machine resources
      (as reported in https://bugzilla.kernel.org/show_bug.cgi?id=199109).
      
      Fix this by letting __ppp_xmit_process() enqueue the skb to
      ppp->file.xq, so that we can check for recursion before adding it to
      the queue. Now ppp_xmit_process() can drop the packet when recursion is
      detected.
      
      __ppp_channel_push() is a bit special. It calls __ppp_xmit_process()
      without having any actual packet to send. This is used by
      ppp_output_wakeup() to re-enable transmission on the parent unit (for
      implementations like ppp_async.c, where the .start_xmit() function
      might not consume the skb, leaving it in ppp->xmit_pending and
      disabling transmission).
      Therefore, __ppp_xmit_process() needs to handle the case where skb is
      NULL, dequeuing as many packets as possible from ppp->file.xq.
      Reported-by: xu heng <xuheng333@zoho.com>
      Fixes: 55454a56 ("ppp: avoid dealock on recursive xmit")
      Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net sched actions: return explicit error when tunnel_key mode is not specified · 28b488f7
      Roman Mashak authored
      
      [ Upstream commit 51d4740f ]
      
      If the set/unset mode of the tunnel_key action is not provided, ->init()
      still returns 0 and the caller proceeds with a bogus 'struct tc_action *'
      object, which results in a crash:
      
      % tc actions add action tunnel_key src_ip 1.1.1.1 dst_ip 2.2.2.1 id 7 index 1
      
      [   35.805515] general protection fault: 0000 [#1] SMP PTI
      [   35.806161] Modules linked in: act_tunnel_key kvm_intel kvm irqbypass
      crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64
      crypto_simd glue_helper cryptd serio_raw
      [   35.808233] CPU: 1 PID: 428 Comm: tc Not tainted 4.16.0-rc4+ #286
      [   35.808929] RIP: 0010:tcf_action_init+0x90/0x190
      [   35.809457] RSP: 0018:ffffb8edc068b9a0 EFLAGS: 00010206
      [   35.810053] RAX: 1320c000000a0003 RBX: 0000000000000001 RCX: 0000000000000000
      [   35.810866] RDX: 0000000000000070 RSI: 0000000000007965 RDI: ffffb8edc068b910
      [   35.811660] RBP: ffffb8edc068b9d0 R08: 0000000000000000 R09: ffffb8edc068b808
      [   35.812463] R10: ffffffffc02bf040 R11: 0000000000000040 R12: ffffb8edc068bb38
      [   35.813235] R13: 0000000000000000 R14: 0000000000000000 R15: ffffb8edc068b910
      [   35.814006] FS:  00007f3d0d8556c0(0000) GS:ffff91d1dbc40000(0000)
      knlGS:0000000000000000
      [   35.814881] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   35.815540] CR2: 000000000043f720 CR3: 0000000019248001 CR4: 00000000001606a0
      [   35.816457] Call Trace:
      [   35.817158]  tc_ctl_action+0x11a/0x220
      [   35.817795]  rtnetlink_rcv_msg+0x23d/0x2e0
      [   35.818457]  ? __slab_alloc+0x1c/0x30
      [   35.819079]  ? __kmalloc_node_track_caller+0xb1/0x2b0
      [   35.819544]  ? rtnl_calcit.isra.30+0xe0/0xe0
      [   35.820231]  netlink_rcv_skb+0xce/0x100
      [   35.820744]  netlink_unicast+0x164/0x220
      [   35.821500]  netlink_sendmsg+0x293/0x370
      [   35.822040]  sock_sendmsg+0x30/0x40
      [   35.822508]  ___sys_sendmsg+0x2c5/0x2e0
      [   35.823149]  ? pagecache_get_page+0x27/0x220
      [   35.823714]  ? filemap_fault+0xa2/0x640
      [   35.824423]  ? page_add_file_rmap+0x108/0x200
      [   35.825065]  ? alloc_set_pte+0x2aa/0x530
      [   35.825585]  ? finish_fault+0x4e/0x70
      [   35.826140]  ? __handle_mm_fault+0xbc1/0x10d0
      [   35.826723]  ? __sys_sendmsg+0x41/0x70
      [   35.827230]  __sys_sendmsg+0x41/0x70
      [   35.827710]  do_syscall_64+0x68/0x120
      [   35.828195]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
      [   35.828859] RIP: 0033:0x7f3d0ca4da67
      [   35.829331] RSP: 002b:00007ffc9f284338 EFLAGS: 00000246 ORIG_RAX:
      000000000000002e
      [   35.830304] RAX: ffffffffffffffda RBX: 00007ffc9f284460 RCX: 00007f3d0ca4da67
      [   35.831247] RDX: 0000000000000000 RSI: 00007ffc9f2843b0 RDI: 0000000000000003
      [   35.832167] RBP: 000000005aa6a7a9 R08: 0000000000000001 R09: 0000000000000000
      [   35.833075] R10: 00000000000005f1 R11: 0000000000000246 R12: 0000000000000000
      [   35.833997] R13: 00007ffc9f2884c0 R14: 0000000000000001 R15: 0000000000674640
      [   35.834923] Code: 24 30 bb 01 00 00 00 45 31 f6 eb 5e 8b 50 08 83 c2 07 83 e2
      fc 83 c2 70 49 8b 07 48 8b 40 70 48 85 c0 74 10 48 89 14 24 4c 89 ff <ff> d0 48
      8b 14 24 48 01 c2 49 01 d6 45 85 ed 74 05 41 83 47 2c
      [   35.837442] RIP: tcf_action_init+0x90/0x190 RSP: ffffb8edc068b9a0
      [   35.838291] ---[ end trace a095c06ee4b97a26 ]---
      
      Fixes: d0f6dd8a ("net/sched: Introduce act_tunnel_key")
      Signed-off-by: Roman Mashak <mrv@mojatatu.com>
      Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      28b488f7
    • Brad Mouring's avatar
      net: phy: Tell caller result of phy_change() · 2274d77c
      Brad Mouring authored
      
      [ Upstream commit a2c054a8 ]
      
      In 664fcf12 (net: phy: Threaded interrupts allow some simplification)
      the phy_interrupt system was changed to use a traditional threaded
      interrupt scheme instead of a workqueue approach.
      
      With this change, the phy status check moved into phy_change, which
      did not report back to the caller whether or not the interrupt was
      handled. This means that, in the case of a shared phy interrupt,
      only the first phydev's interrupt registers are checked (since
      phy_interrupt() would always return IRQ_HANDLED). This leads to
      interrupt storms when it is a secondary device that's actually the
      interrupt source.
      Signed-off-by: Brad Mouring <brad.mouring@ni.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      2274d77c
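The shared-interrupt problem described in the phy_change() commit above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: every name below is hypothetical, and the dispatcher simply stops at the first handler that claims the interrupt, which mirrors the effect the commit message describes rather than the kernel's real interrupt code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; all names here are hypothetical. */
enum irq_result_model { IRQ_NONE_MODEL, IRQ_HANDLED_MODEL };

struct phy_model {
    bool irq_pending;   /* stands in for the PHY's interrupt status register */
    int  serviced;      /* how many times this PHY was actually serviced */
};

typedef enum irq_result_model (*irq_handler_model)(struct phy_model *);

/* Pre-fix behaviour: claims the interrupt whether or not this PHY raised it. */
static enum irq_result_model phy_irq_always_handled(struct phy_model *phy)
{
    if (phy->irq_pending) {
        phy->irq_pending = false;
        phy->serviced++;
    }
    return IRQ_HANDLED_MODEL;
}

/* Post-fix behaviour: reports "not mine" when this PHY has nothing pending. */
static enum irq_result_model phy_irq_reports_result(struct phy_model *phy)
{
    if (!phy->irq_pending)
        return IRQ_NONE_MODEL;
    phy->irq_pending = false;
    phy->serviced++;
    return IRQ_HANDLED_MODEL;
}

/* Simplified dispatcher: stop at the first handler that claims the interrupt. */
static bool dispatch_shared_irq(struct phy_model *phys, size_t n,
                                irq_handler_model handler)
{
    assert(phys != NULL && handler != NULL);
    for (size_t i = 0; i < n; i++)
        if (handler(&phys[i]) == IRQ_HANDLED_MODEL)
            return true;
    return false;
}
```

In this model, when only the second PHY has an interrupt pending, the always-handled variant leaves it pending (so the line would fire again), while the variant that reports its result lets the dispatcher reach the second PHY.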
    • Ido Schimmel's avatar
      mlxsw: spectrum_buffers: Set a minimum quota for CPU port traffic · 42cf2a1e
      Ido Schimmel authored
      
      [ Upstream commit bcdd5de8 ]
      
      In commit 9ffcc372 ("mlxsw: spectrum: Allow packets to be trapped
      from any PG") I fixed a problem where packets could not be trapped to
      the CPU due to exceeded shared buffer quotas. The mentioned commit
      explains the problem in detail.
      
      The problem was fixed by assigning a minimum quota for the CPU port and
      the traffic class used for scheduling traffic to the CPU.
      
      However, commit 117b0dad ("mlxsw: Create a different trap group list
      for each device") assigned different traffic classes to different
      packet types and rendered the fix useless.
      
      Fix the problem by assigning a minimum quota for the CPU port and all
      the traffic classes that are currently in use.
      
      Fixes: 117b0dad ("mlxsw: Create a different trap group list for each device")
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reported-by: Eddie Shklaer <eddies@mellanox.com>
      Tested-by: Eddie Shklaer <eddies@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      42cf2a1e
    • David Lebrun's avatar
      ipv6: sr: fix scheduling in RCU when creating seg6 lwtunnel state · dbad5abd
      David Lebrun authored
      
      [ Upstream commit 191f86ca ]
      
      The seg6_build_state() function is called with RCU read lock held,
      so we cannot use GFP_KERNEL. This patch uses GFP_ATOMIC instead.
      
      [   92.770271] =============================
      [   92.770628] WARNING: suspicious RCU usage
      [   92.770921] 4.16.0-rc4+ #12 Not tainted
      [   92.771277] -----------------------------
      [   92.771585] ./include/linux/rcupdate.h:302 Illegal context switch in RCU read-side critical section!
      [   92.772279]
      [   92.772279] other info that might help us debug this:
      [   92.772279]
      [   92.773067]
      [   92.773067] rcu_scheduler_active = 2, debug_locks = 1
      [   92.773514] 2 locks held by ip/2413:
      [   92.773765]  #0:  (rtnl_mutex){+.+.}, at: [<00000000e5461720>] rtnetlink_rcv_msg+0x441/0x4d0
      [   92.774377]  #1:  (rcu_read_lock){....}, at: [<00000000df4f161e>] lwtunnel_build_state+0x59/0x210
      [   92.775065]
      [   92.775065] stack backtrace:
      [   92.775371] CPU: 0 PID: 2413 Comm: ip Not tainted 4.16.0-rc4+ #12
      [   92.775791] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1.fc27 04/01/2014
      [   92.776608] Call Trace:
      [   92.776852]  dump_stack+0x7d/0xbc
      [   92.777130]  __schedule+0x133/0xf00
      [   92.777393]  ? unwind_get_return_address_ptr+0x50/0x50
      [   92.777783]  ? __sched_text_start+0x8/0x8
      [   92.778073]  ? rcu_is_watching+0x19/0x30
      [   92.778383]  ? kernel_text_address+0x49/0x60
      [   92.778800]  ? __kernel_text_address+0x9/0x30
      [   92.779241]  ? unwind_get_return_address+0x29/0x40
      [   92.779727]  ? pcpu_alloc+0x102/0x8f0
      [   92.780101]  _cond_resched+0x23/0x50
      [   92.780459]  __mutex_lock+0xbd/0xad0
      [   92.780818]  ? pcpu_alloc+0x102/0x8f0
      [   92.781194]  ? seg6_build_state+0x11d/0x240
      [   92.781611]  ? save_stack+0x9b/0xb0
      [   92.781965]  ? __ww_mutex_wakeup_for_backoff+0xf0/0xf0
      [   92.782480]  ? seg6_build_state+0x11d/0x240
      [   92.782925]  ? lwtunnel_build_state+0x1bd/0x210
      [   92.783393]  ? ip6_route_info_create+0x687/0x1640
      [   92.783846]  ? ip6_route_add+0x74/0x110
      [   92.784236]  ? inet6_rtm_newroute+0x8a/0xd0
      
      Fixes: 6c8702c6 ("ipv6: sr: add support for SRH encapsulation and injection with lwtunnels")
      Signed-off-by: David Lebrun <dlebrun@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      dbad5abd
    • David Lebrun's avatar
      ipv6: sr: fix NULL pointer dereference when setting encap source address · cb4963b4
      David Lebrun authored
      
      [ Upstream commit 8936ef76 ]
      
      When using seg6 in encap mode, we call ipv6_dev_get_saddr() to set the
      source address of the outer IPv6 header, in case none was specified.
      Using skb->dev can lead to BUG() when it is in an inconsistent state.
      This patch uses the net_device attached to the skb's dst instead.
      
      [940807.667429] BUG: unable to handle kernel NULL pointer dereference at 000000000000047c
      [940807.762427] IP: ipv6_dev_get_saddr+0x8b/0x1d0
      [940807.815725] PGD 0 P4D 0
      [940807.847173] Oops: 0000 [#1] SMP PTI
      [940807.890073] Modules linked in:
      [940807.927765] CPU: 6 PID: 0 Comm: swapper/6 Tainted: G        W        4.16.0-rc1-seg6bpf+ #2
      [940808.028988] Hardware name: HP ProLiant DL120 G6/ProLiant DL120 G6, BIOS O26    09/06/2010
      [940808.128128] RIP: 0010:ipv6_dev_get_saddr+0x8b/0x1d0
      [940808.187667] RSP: 0018:ffff88043fd836b0 EFLAGS: 00010206
      [940808.251366] RAX: 0000000000000005 RBX: ffff88042cb1c860 RCX: 00000000000000fe
      [940808.338025] RDX: 00000000000002c0 RSI: ffff88042cb1c860 RDI: 0000000000004500
      [940808.424683] RBP: ffff88043fd83740 R08: 0000000000000000 R09: ffffffffffffffff
      [940808.511342] R10: 0000000000000040 R11: 0000000000000000 R12: ffff88042cb1c850
      [940808.598012] R13: ffffffff8208e380 R14: ffff88042ac8da00 R15: 0000000000000002
      [940808.684675] FS:  0000000000000000(0000) GS:ffff88043fd80000(0000) knlGS:0000000000000000
      [940808.783036] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [940808.852975] CR2: 000000000000047c CR3: 00000004255fe000 CR4: 00000000000006e0
      [940808.939634] Call Trace:
      [940808.970041]  <IRQ>
      [940808.995250]  ? ip6t_do_table+0x265/0x640
      [940809.043341]  seg6_do_srh_encap+0x28f/0x300
      [940809.093516]  ? seg6_do_srh+0x1a0/0x210
      [940809.139528]  seg6_do_srh+0x1a0/0x210
      [940809.183462]  seg6_output+0x28/0x1e0
      [940809.226358]  lwtunnel_output+0x3f/0x70
      [940809.272370]  ip6_xmit+0x2b8/0x530
      [940809.313185]  ? ac6_proc_exit+0x20/0x20
      [940809.359197]  inet6_csk_xmit+0x7d/0xc0
      [940809.404173]  tcp_transmit_skb+0x548/0x9a0
      [940809.453304]  __tcp_retransmit_skb+0x1a8/0x7a0
      [940809.506603]  ? ip6_default_advmss+0x40/0x40
      [940809.557824]  ? tcp_current_mss+0x24/0x90
      [940809.605925]  tcp_retransmit_skb+0xd/0x80
      [940809.654016]  tcp_xmit_retransmit_queue.part.17+0xf9/0x210
      [940809.719797]  tcp_ack+0xa47/0x1110
      [940809.760612]  tcp_rcv_established+0x13c/0x570
      [940809.812865]  tcp_v6_do_rcv+0x151/0x3d0
      [940809.858879]  tcp_v6_rcv+0xa5c/0xb10
      [940809.901770]  ? seg6_output+0xdd/0x1e0
      [940809.946745]  ip6_input_finish+0xbb/0x460
      [940809.994837]  ip6_input+0x74/0x80
      [940810.034612]  ? ip6_rcv_finish+0xb0/0xb0
      [940810.081663]  ipv6_rcv+0x31c/0x4c0
      ...
      
      Fixes: 6c8702c6 ("ipv6: sr: add support for SRH encapsulation and injection with lwtunnels")
      Reported-by: Tom Herbert <tom@quantonium.net>
      Signed-off-by: David Lebrun <dlebrun@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      cb4963b4
    • Stefano Brivio's avatar
      ipv6: old_dport should be a __be16 in __ip6_datagram_connect() · 5defa8c9
      Stefano Brivio authored
      
      [ Upstream commit 5f2fb802 ]
      
      Fixes: 2f987a76 ("net: ipv6: keep sk status consistent after datagram connect failure")
      Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
      Acked-by: Paolo Abeni <pabeni@redhat.com>
      Acked-by: Guillaume Nault <g.nault@alphalink.fr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      5defa8c9
    • Paolo Abeni's avatar
      net: ipv6: keep sk status consistent after datagram connect failure · a8f02bef
      Paolo Abeni authored
      
      [ Upstream commit 2f987a76 ]
      
      On unsuccessful ip6_datagram_connect(), if the failure is caused by
      ip6_datagram_dst_update(), the sk peer information is cleared, but
      the sk->sk_state is preserved.
      
      If the socket was already in an established state, the overall sk
      status is inconsistent and fouls later checks in datagram code.
      
      Fix this by saving the old peer information and restoring it in
      case of failure. This also aligns ipv6 datagram connect() behavior
      with ipv4.
      
      v1 -> v2:
       - added missing Fixes tag
      
      Fixes: 85cb73ff ("net: ipv6: reset daddr and dport in sk if connect() fails")
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      a8f02bef
    • Shannon Nelson's avatar
      macvlan: filter out unsupported feature flags · 82fb8178
      Shannon Nelson authored
      
      [ Upstream commit 13fbcc8d ]
      
      Adding a macvlan device on top of a lowerdev that supports
      the xfrm offloads fails with a new regression:
        # ip link add link ens1f0 mv0 type macvlan
        RTNETLINK answers: Operation not permitted
      
      Tracing down the failure shows that the macvlan device inherits
      the NETIF_F_HW_ESP and NETIF_F_HW_ESP_TX_CSUM feature flags
      from the lowerdev, but with no dev->xfrmdev_ops API filled
      in, it doesn't actually support xfrm.  When the request is
      made to add the new macvlan device, the XFRM listener for
      NETDEV_REGISTER calls xfrm_api_check() which fails the new
      registration because dev->xfrmdev_ops is NULL.
      
      The macvlan creation succeeds when we filter out the ESP
      feature flags in macvlan_fix_features(), so let's filter them
      out like we're already filtering out ~NETIF_F_NETNS_LOCAL.
      When XFRM support is added in the future, we can add the flags
      into MACVLAN_FEATURES.
      
      This same problem could crop up in the future with any other
      new feature flags, so let's filter out any flags that aren't
      defined as supported in macvlan.
      
      Fixes: d77e38e6 ("xfrm: Add an IPsec hardware offloading API")
      Reported-by: Alexey Kodanev <alexey.kodanev@oracle.com>
      Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      82fb8178
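The filtering idea in the macvlan commit above (keep only flags the upper device declares as supported, so newly introduced lower-device flags are never inherited by accident) can be sketched as a plain bitmask operation. The flag names and bit positions below are hypothetical stand-ins, not the kernel's NETIF_F_* values, and the function is a simplified model rather than macvlan_fix_features() itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical feature bits for illustration; not the kernel's NETIF_F_* values. */
#define F_SG_MODEL              (UINT32_C(1) << 0)
#define F_CSUM_MODEL            (UINT32_C(1) << 1)
#define F_TSO_MODEL             (UINT32_C(1) << 2)
#define F_HW_ESP_MODEL          (UINT32_C(1) << 3)
#define F_HW_ESP_TX_CSUM_MODEL  (UINT32_C(1) << 4)

/* The set of flags the stacked device claims to support (an allow-list). */
#define SUPPORTED_FEATURES_MODEL (F_SG_MODEL | F_CSUM_MODEL | F_TSO_MODEL)

/* Keep only allow-listed flags from whatever the lower device advertises. */
static uint32_t fix_features_model(uint32_t inherited)
{
    uint32_t features = inherited & SUPPORTED_FEATURES_MODEL;

    /* Nothing outside the allow-list may survive the filter. */
    assert((features & ~SUPPORTED_FEATURES_MODEL) == 0);
    return features;
}
```

An allow-list has the property the commit message asks for: a flag added to the lower device later is dropped by default until it is explicitly added to the supported set.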
    • Arkadi Sharshevsky's avatar
      devlink: Remove redundant free on error path · b51eb57d
      Arkadi Sharshevsky authored
      
      [ Upstream commit 7fe4d6dc ]
      
      The current code performs unneeded free. Remove the redundant skb freeing
      during the error path.
      
      Fixes: 1555d204 ("devlink: Support for pipeline debug (dpipe)")
      Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      b51eb57d
    • Grygorii Strashko's avatar
      net: phy: relax error checking when creating sysfs link netdev->phydev · 67a1dc56
      Grygorii Strashko authored
      
      [ Upstream commit 4414b3ed ]
      
      Some Ethernet drivers (like TI CPSW) may connect and manage more than one
      net PHY per netdevice. As a result, such drivers produce a warning during
      system boot and fail to connect the second PHY to the netdevice when the
      PHYLIB framework tries to create the sysfs link netdev->phydev for the
      second PHY in phy_attach_direct(), because a sysfs link with the same name
      has already been created for the first PHY. As a result, the second CPSW
      external port becomes unusable.
      
      Fix it by relaxing error checking when the PHYLIB framework creates the
      sysfs link netdev->phydev in phy_attach_direct(), suppressing the warning
      by using sysfs_create_link_nowarn() and adding an error message instead.
      After this change, failure to create the links (phy->netdev and
      netdev->phy) is no longer fatal and the system can continue working, which
      fixes the TI CPSW issue.
      
      Cc: Florian Fainelli <f.fainelli@gmail.com>
      Cc: Andrew Lunn <andrew@lunn.ch>
      Fixes: a3995460 ("net: phy: Relax error checking on sysfs_create_link()")
      Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      67a1dc56
    • Grygorii Strashko's avatar
      sysfs: symlink: export sysfs_create_link_nowarn() · 223c5424
      Grygorii Strashko authored
      
      [ Upstream commit 2399ac42 ]
      
      sysfs_create_link_nowarn() is going to be used by the phylib framework in
      a subsequent patch, and phylib can be built as a module. Hence, export
      sysfs_create_link_nowarn() to avoid build errors.
      
      Cc: Florian Fainelli <f.fainelli@gmail.com>
      Cc: Andrew Lunn <andrew@lunn.ch>
      Fixes: a3995460 ("net: phy: Relax error checking on sysfs_create_link()")
      Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      223c5424
    • Michal Kalderon's avatar
      qed: Fix non TCP packets should be dropped on iWARP ll2 connection · 497166d6
      Michal Kalderon authored
      
      [ Upstream commit 16da0904 ]
      
      FW workaround. The iWARP LL2 connection did not expect non-TCP packets
      to arrive on its connection. The fix drops any non-TCP packets.
      
      Fixes: b5c29ca7 ("qed: iWARP CM - setup a ll2 connection for handling SYN packets")
      Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
      Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      497166d6
    • Soheil Hassas Yeganeh's avatar
      tcp: purge write queue upon aborting the connection · e44c1733
      Soheil Hassas Yeganeh authored
      
      [ Upstream commit e05836ac ]
      
      When the connection is aborted, there is no point in
      keeping the packets on the write queue until the connection
      is closed.
      
      Similar to a27fd7a8 ('tcp: purge write queue upon RST'),
      this is essential for a correct MSG_ZEROCOPY implementation,
      because userspace cannot call close(fd) before receiving
      zerocopy signals even when the connection is aborted.
      
      Fixes: f214f915 ("tcp: enable MSG_ZEROCOPY")
      Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      e44c1733
    • Soheil Hassas Yeganeh's avatar
      tcp: reset sk_send_head in tcp_write_queue_purge · dbbf2d1e
      Soheil Hassas Yeganeh authored
      
      tcp_write_queue_purge clears all the SKBs in the write queue
      but does not reset the sk_send_head. As a result, we can have
      a NULL pointer dereference anywhere that we use tcp_send_head
      instead of the tcp_write_queue_tail.
      
      For example, after a27fd7a8 (tcp: purge write queue upon RST),
      we can purge the write queue on RST. Prior to
      75c119af (tcp: implement rb-tree based retransmit queue),
      tcp_push will only check tcp_send_head and then accesses
      tcp_write_queue_tail to send the actual SKB. As a result, it will
      dereference a NULL pointer.
      
      This has been reported twice for 4.14 where we don't have
      75c119af:
      
      By Timofey Titovets:
      
      [  422.081094] BUG: unable to handle kernel NULL pointer dereference
      at 0000000000000038
      [  422.081254] IP: tcp_push+0x42/0x110
      [  422.081314] PGD 0 P4D 0
      [  422.081364] Oops: 0002 [#1] SMP PTI
      
      By Yongjian Xu:
      
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
      IP: tcp_push+0x48/0x120
      PGD 80000007ff77b067 P4D 80000007ff77b067 PUD 7fd989067 PMD 0
      Oops: 0002 [#18] SMP PTI
      Modules linked in: tcp_diag inet_diag tcp_bbr sch_fq iTCO_wdt
      iTCO_vendor_support pcspkr ixgbe mdio i2c_i801 lpc_ich joydev input_leds shpchp
      e1000e igb dca ptp pps_core hwmon mei_me mei ipmi_si ipmi_msghandler sg ses
      scsi_transport_sas enclosure ext4 jbd2 mbcache sd_mod ahci libahci megaraid_sas
      wmi ast ttm dm_mirror dm_region_hash dm_log dm_mod dax
      CPU: 6 PID: 14156 Comm: [ET_NET 6] Tainted: G D 4.14.26-1.el6.x86_64 #1
      Hardware name: LENOVO ThinkServer RD440 /ThinkServer RD440, BIOS A0TS80A
      09/22/2014
      task: ffff8807d78d8140 task.stack: ffffc9000e944000
      RIP: 0010:tcp_push+0x48/0x120
      RSP: 0018:ffffc9000e947a88 EFLAGS: 00010246
      RAX: 00000000000005b4 RBX: ffff880f7cce9c00 RCX: 0000000000000000
      RDX: 0000000000000000 RSI: 0000000000000040 RDI: ffff8807d00f5000
      RBP: ffffc9000e947aa8 R08: 0000000000001c84 R09: 0000000000000000
      R10: ffff8807d00f5158 R11: 0000000000000000 R12: ffff8807d00f5000
      R13: 0000000000000020 R14: 00000000000256d4 R15: 0000000000000000
      FS: 00007f5916de9700(0000) GS:ffff88107fd00000(0000) knlGS:0000000000000000
      CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000000038 CR3: 00000007f8226004 CR4: 00000000001606e0
      Call Trace:
      tcp_sendmsg_locked+0x33d/0xe50
      tcp_sendmsg+0x37/0x60
      inet_sendmsg+0x39/0xc0
      sock_sendmsg+0x49/0x60
      sock_write_iter+0xb6/0x100
      do_iter_readv_writev+0xec/0x130
      ? rw_verify_area+0x49/0xb0
      do_iter_write+0x97/0xd0
      vfs_writev+0x7e/0xe0
      ? __wake_up_common_lock+0x80/0xa0
      ? __fget_light+0x2c/0x70
      ? __do_page_fault+0x1e7/0x530
      do_writev+0x60/0xf0
      ? inet_shutdown+0xac/0x110
      SyS_writev+0x10/0x20
      do_syscall_64+0x6f/0x140
      ? prepare_exit_to_usermode+0x8b/0xa0
      entry_SYSCALL_64_after_hwframe+0x3d/0xa2
      RIP: 0033:0x3135ce0c57
      RSP: 002b:00007f5916de4b00 EFLAGS: 00000293 ORIG_RAX: 0000000000000014
      RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000003135ce0c57
      RDX: 0000000000000002 RSI: 00007f5916de4b90 RDI: 000000000000606f
      RBP: 0000000000000000 R08: 0000000000000000 R09: 00007f5916de8c38
      R10: 0000000000000000 R11: 0000000000000293 R12: 00000000000464cc
      R13: 00007f5916de8c30 R14: 00007f58d8bef080 R15: 0000000000000002
      Code: 48 8b 97 60 01 00 00 4c 8d 97 58 01 00 00 41 b9 00 00 00 00 41 89 f3 4c 39
      d2 49 0f 44 d1 41 81 e3 00 80 00 00 0f 85 b0 00 00 00 <80> 4a 38 08 44 8b 8f 74
      06 00 00 44 89 8f 7c 06 00 00 83 e6 01
      RIP: tcp_push+0x48/0x120 RSP: ffffc9000e947a88
      CR2: 0000000000000038
      ---[ end trace 8d545c2e93515549 ]---
      
      Fixes: a27fd7a8 (tcp: purge write queue upon RST)
      Reported-by: Timofey Titovets <nefelim4ag@gmail.com>
      Reported-by: Yongjian Xu <yongjianchn@gmail.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
      Tested-by: Yongjian Xu <yongjianchn@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      dbbf2d1e
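The stale-pointer hazard in the tcp_write_queue_purge commit above can be shown with a small userspace queue model. This is a sketch under stated assumptions: the struct and function names are hypothetical and only mimic the roles of the write queue, its tail, and sk_send_head; it is not the kernel's sk_buff handling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace model; struct and function names are hypothetical. */
struct skb_model {
    struct skb_model *next;
};

struct sock_model {
    struct skb_model *queue_head;  /* oldest queued buffer */
    struct skb_model *queue_tail;  /* newest queued buffer */
    struct skb_model *send_head;   /* next buffer to transmit */
};

/* Append one buffer; returns 0 on success, -1 if allocation fails. */
static int queue_add(struct sock_model *sk)
{
    struct skb_model *skb = calloc(1, sizeof(*skb));

    if (!skb)
        return -1;
    if (sk->queue_tail)
        sk->queue_tail->next = skb;
    else
        sk->queue_head = skb;
    sk->queue_tail = skb;
    if (!sk->send_head)
        sk->send_head = skb;
    return 0;
}

/* Free every queued buffer and clear all pointers into the queue. */
static void write_queue_purge_model(struct sock_model *sk)
{
    struct skb_model *skb = sk->queue_head;

    while (skb) {
        struct skb_model *next = skb->next;

        free(skb);
        skb = next;
    }
    sk->queue_head = NULL;
    sk->queue_tail = NULL;
    sk->send_head = NULL;   /* the reset the commit describes */
}

/* A push is only safe when there is something to send and a valid tail. */
static int can_push(const struct sock_model *sk)
{
    assert(sk != NULL);
    return sk->send_head != NULL && sk->queue_tail != NULL;
}
```

Without the send_head reset in the purge function, a caller that checks only send_head would go on to use a tail pointer that is already NULL, which is the pattern the commit message attributes to tcp_push.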
  2. 28 Mar, 2018 4 commits
    • Greg Kroah-Hartman's avatar
      Linux 4.14.31 · 9861e664
      Greg Kroah-Hartman authored
      9861e664
    • Daniel Borkmann's avatar
      bpf, x64: increase number of passes · 7514cd2f
      Daniel Borkmann authored
      commit 6007b080 upstream.
      
      In Cilium, some of the main programs we run today are hitting 9 passes
      on x64's JIT compiler, and we've already had cases where we surpassed
      the limit, at which point the JIT punts the program to the interpreter
      instead. This leads to insertion failures due to CONFIG_BPF_JIT_ALWAYS_ON,
      or insertion failures due to the prog array owner being JITed but the
      program to insert not (both must have the same JITed/non-JITed property).
      
      In one concrete case, the program image shrunk from 12,767 bytes down to
      10,288 bytes, and the image converged after 16 steps. I've measured
      that this took 340us in the JIT until it converged on my i7-6600U. Thus,
      increase the original limit, which we have had from day one when the JIT
      covered only cBPF, before we run into the case (similar to the
      complexity limit) where we trip over this and hit program rejections.
      Also add a cond_resched() into the compilation loop; the JIT process
      runs without any locks and may sleep anyway.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      7514cd2f
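The pass limit discussed in the JIT commit above bounds a loop that re-emits the image until its size stops changing. The sketch below models that convergence loop in userspace. The size function, its numbers, and the limits 10 and 20 used in the usage note are illustrative assumptions, not values taken from the commit message or from the kernel's JIT.

```c
#include <assert.h>

/*
 * Hypothetical image size after a given pass: shrinks by 50 bytes per pass
 * from 1000 bytes until it bottoms out at 500 bytes.
 */
static int image_size_model(int pass)
{
    int len = 1000 - 50 * pass;

    return len < 500 ? 500 : len;
}

/*
 * Run passes until two consecutive passes produce the same size.
 * Returns the number of passes executed, or -1 if the limit is hit first
 * (the point where a JIT would have to give up on the program).
 */
static int jit_passes_until_converged(int max_passes)
{
    int oldlen = 0;

    assert(max_passes > 0);
    for (int pass = 0; pass < max_passes; pass++) {
        int len = image_size_model(pass);

        if (len == oldlen)
            return pass + 1;
        oldlen = len;
    }
    return -1;
}
```

With this model, a limit of 10 passes gives up before the size settles, while a limit of 20 lets the loop converge, which is the kind of rejection the higher limit is meant to avoid.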
    • Chenbo Feng's avatar
      bpf: skip unnecessary capability check · b4e02202
      Chenbo Feng authored
      commit 0fa4fe85 upstream.
      
      The current check statement in the BPF syscall does a capability check
      for CAP_SYS_ADMIN before checking sysctl_unprivileged_bpf_disabled. This
      code path triggers unnecessary security hooks on capability checking
      and causes false alarms for unprivileged processes trying to get
      CAP_SYS_ADMIN access. This can be resolved by simply switching the order
      of the checks, since CAP_SYS_ADMIN is not required anyway if the
      unprivileged bpf syscall is allowed.
      Signed-off-by: Chenbo Feng <fengc@google.com>
      Acked-by: Lorenzo Colitti <lorenzo@google.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      b4e02202
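The reordering described in the capability-check commit above relies on C's short-circuit evaluation of `&&`: whichever operand comes first decides whether the second is evaluated at all. The userspace sketch below counts how often a simulated capability check runs under each ordering. All names are hypothetical stand-ins for capable(CAP_SYS_ADMIN) and sysctl_unprivileged_bpf_disabled, and the two functions are a model of the idea, not the kernel's syscall code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Userspace model; every name here is a hypothetical stand-in. */
static int  capable_calls;              /* counts simulated capability checks */
static bool has_cap_sys_admin;          /* simulated capability of the caller */
static int  unprivileged_bpf_disabled;  /* simulated sysctl value */

/* Simulated capability check; in the kernel this would trigger security hooks. */
static bool capable_model(void)
{
    capable_calls++;
    return has_cap_sys_admin;
}

/* Capability first: the check (and its hooks) runs on every call. */
static int check_capability_first(void)
{
    if (!capable_model() && unprivileged_bpf_disabled)
        return -EPERM;
    return 0;
}

/* Sysctl first: the capability check runs only when it can change the result. */
static int check_sysctl_first(void)
{
    if (unprivileged_bpf_disabled && !capable_model())
        return -EPERM;
    return 0;
}
```

Both orderings return the same result for every input; the difference is only the side effect of running the capability check when unprivileged use is allowed anyway.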
    • Daniel Borkmann's avatar
      kbuild: disable clang's default use of -fmerge-all-constants · 3e113097
      Daniel Borkmann authored
      commit 87e0d4f0 upstream.
      
      Prasad reported that he has seen crashes in BPF subsystem with netd
      on Android with arm64 in the form of (note, the taint is unrelated):
      
        [ 4134.721483] Unable to handle kernel paging request at virtual address 800000001
        [ 4134.820925] Mem abort info:
        [ 4134.901283]   Exception class = DABT (current EL), IL = 32 bits
        [ 4135.016736]   SET = 0, FnV = 0
        [ 4135.119820]   EA = 0, S1PTW = 0
        [ 4135.201431] Data abort info:
        [ 4135.301388]   ISV = 0, ISS = 0x00000021
        [ 4135.359599]   CM = 0, WnR = 0
        [ 4135.470873] user pgtable: 4k pages, 39-bit VAs, pgd = ffffffe39b946000
        [ 4135.499757] [0000000800000001] *pgd=0000000000000000, *pud=0000000000000000
        [ 4135.660725] Internal error: Oops: 96000021 [#1] PREEMPT SMP
        [ 4135.674610] Modules linked in:
        [ 4135.682883] CPU: 5 PID: 1260 Comm: netd Tainted: G S      W       4.14.19+ #1
        [ 4135.716188] task: ffffffe39f4aa380 task.stack: ffffff801d4e0000
        [ 4135.731599] PC is at bpf_prog_add+0x20/0x68
        [ 4135.741746] LR is at bpf_prog_inc+0x20/0x2c
        [ 4135.751788] pc : [<ffffff94ab7ad584>] lr : [<ffffff94ab7ad638>] pstate: 60400145
        [ 4135.769062] sp : ffffff801d4e3ce0
        [...]
        [ 4136.258315] Process netd (pid: 1260, stack limit = 0xffffff801d4e0000)
        [ 4136.273746] Call trace:
        [...]
        [ 4136.442494] 3ca0: ffffff94ab7ad584 0000000060400145 ffffffe3a01bf8f8 0000000000000006
        [ 4136.460936] 3cc0: 0000008000000000 ffffff94ab844204 ffffff801d4e3cf0 ffffff94ab7ad584
        [ 4136.479241] [<ffffff94ab7ad584>] bpf_prog_add+0x20/0x68
        [ 4136.491767] [<ffffff94ab7ad638>] bpf_prog_inc+0x20/0x2c
        [ 4136.504536] [<ffffff94ab7b5d08>] bpf_obj_get_user+0x204/0x22c
        [ 4136.518746] [<ffffff94ab7ade68>] SyS_bpf+0x5a8/0x1a88
      
      Android's netd was basically pinning the uid cookie BPF map in BPF
      fs (/sys/fs/bpf/traffic_cookie_uid_map) and later on retrieving it
      again resulting in above panic. Issue is that the map was wrongly
      identified as a prog! Above kernel was compiled with clang 4.0,
      and it turns out that clang decided to merge the bpf_prog_iops and
      bpf_map_iops into a single memory location, such that the two i_ops
      could then not be distinguished anymore.
      
      Reason for this miscompilation is that clang has the more aggressive
      -fmerge-all-constants enabled by default. In fact, clang source code
      has a comment about it in lib/AST/ExprConstant.cpp on why it is okay
      to do so:
      
        Pointers with different bases cannot represent the same object.
        (Note that clang defaults to -fmerge-all-constants, which can
        lead to inconsistent results for comparisons involving the address
        of a constant; this generally doesn't matter in practice.)
      
      The issue never appeared with gcc however, since gcc does not enable
      -fmerge-all-constants by default and even *explicitly* states in
      its option description that using this flag results in non-conforming
      behavior, quote from man gcc:
      
        Languages like C or C++ require each variable, including multiple
        instances of the same variable in recursive calls, to have distinct
        locations, so using this option results in non-conforming behavior.
      
      There are also various clang bug reports open on that matter [1],
      where clang developers acknowledge the non-conforming behavior,
      and refer to disabling it with -fno-merge-all-constants. But even
      if this gets fixed in clang today, there are already users out there
      that triggered this. Thus, fix this issue by explicitly adding
      -fno-merge-all-constants to the kernel's Makefile to generically
      disable this optimization, since potentially other places in the
      kernel could subtly break as well.
      
      Note, there is also a flag called -fmerge-constants (not supported
      by clang), which is more conservative and only applies to strings
      and it's enabled in gcc's -O/-O2/-O3/-Os optimization levels. In
      gcc's code, the two flags -fmerge-{all-,}constants share the same
      variable internally, so when disabling it via -fno-merge-all-constants,
      then we really don't merge any const data (e.g. strings), and text
      size increases with gcc (14,927,214 -> 14,942,646 for vmlinux.o).
      
        $ gcc -fverbose-asm -O2 foo.c -S -o foo.S
          -> foo.S lists -fmerge-constants under options enabled
        $ gcc -fverbose-asm -O2 -fno-merge-all-constants foo.c -S -o foo.S
          -> foo.S doesn't list -fmerge-constants under options enabled
        $ gcc -fverbose-asm -O2 -fno-merge-all-constants -fmerge-constants foo.c -S -o foo.S
          -> foo.S lists -fmerge-constants under options enabled
      
      Thus, as a workaround we need to set both -fno-merge-all-constants
      *and* -fmerge-constants in the Makefile in order for text size to
      stay as is.
      
        [1] https://bugs.llvm.org/show_bug.cgi?id=18538
      
      Reported-by: Prasad Sodagudi <psodagud@codeaurora.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Chenbo Feng <fengc@google.com>
      Cc: Richard Smith <richard-llvm@metafoo.co.uk>
      Cc: Chandler Carruth <chandlerc@gmail.com>
      Cc: linux-kernel@vger.kernel.org
      Tested-by: Prasad Sodagudi <psodagud@codeaurora.org>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      3e113097
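The failure mode in the -fmerge-all-constants commit above comes from code that tells object kinds apart by comparing the address of a constant operations table. The sketch below reproduces that pattern in userspace. The struct, the table names, and the classifier are hypothetical models of bpf_prog_iops and bpf_map_iops, and the result assumes a build without -fmerge-all-constants, so that the two identical constant tables keep distinct addresses.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an operations table. */
struct iops_model {
    void *lookup;
};

/* Two constant tables with identical contents but different meanings. */
static const struct iops_model prog_iops_model = { NULL };
static const struct iops_model map_iops_model  = { NULL };

enum obj_type_model { OBJ_UNKNOWN, OBJ_PROG, OBJ_MAP };

/* Identify the object kind purely by which table the pointer refers to. */
static enum obj_type_model classify(const struct iops_model *ops)
{
    assert(ops != NULL);
    if (ops == &prog_iops_model)
        return OBJ_PROG;
    if (ops == &map_iops_model)
        return OBJ_MAP;
    return OBJ_UNKNOWN;
}
```

If a compiler merged the two tables into one memory location, both addresses would compare equal and classify() would report every map as a prog, which matches the misidentification the commit message describes.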