1. 10 Mar, 2019 40 commits
    • Vlad Buslov's avatar
      net: sched: act_tunnel_key: fix NULL pointer dereference during init · 38460809
      Vlad Buslov authored
      [ Upstream commit a3df633a ]
      
      Metadata pointer is only initialized for action TCA_TUNNEL_KEY_ACT_SET, but
      it is unconditionally dereferenced in tunnel_key_init() error handler.
      Verify that metadata pointer is not NULL before dereferencing it in
      tunnel_key_init error handling code.
      
      Fixes: ee28bb56 ("net/sched: fix memory leak in act_tunnel_key_init()")
      Signed-off-by: default avatarVlad Buslov <vladbu@mellanox.com>
      Reviewed-by: default avatarDavide Caratti <dcaratti@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      38460809
    • Davide Caratti's avatar
      net/sched: act_skbedit: fix refcount leak when replace fails · 69e6fb18
      Davide Caratti authored
      [ Upstream commit 6191da98 ]
      
      when act_skbedit was converted to use RCU in the data plane, we added an
      error path, but we forgot to drop the action refcount in case of failure
      during a 'replace' operation:
      
       # tc actions add action skbedit ptype otherhost pass index 100
       # tc action show action skbedit
       total acts 1
      
               action order 0: skbedit  ptype otherhost pass
                index 100 ref 1 bind 0
       # tc actions replace action skbedit ptype otherhost drop index 100
       RTNETLINK answers: Cannot allocate memory
       We have an error talking to the kernel
       # tc action show action skbedit
       total acts 1
      
               action order 0: skbedit  ptype otherhost pass
                index 100 ref 2 bind 0
      
      Ensure we call tcf_idr_release(), in case 'params_new' allocation failed,
      also when the action is being replaced.
      
      Fixes: c749cdda ("net/sched: act_skbedit: don't use spinlock in the data path")
      Signed-off-by: default avatarDavide Caratti <dcaratti@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      69e6fb18
    • Davide Caratti's avatar
      net/sched: act_ipt: fix refcount leak when replace fails · f1446b16
      Davide Caratti authored
      [ Upstream commit 8f67c90e ]
      
      After commit 4e8ddd7f ("net: sched: don't release reference on action
      overwrite"), the error path of all actions was converted to drop refcount
      also when the action was being overwritten. But we forgot act_ipt_init(),
      in case allocation of 'tname' was not successful:
      
       # tc action add action xt -j LOG --log-prefix hello index 100
       tablename: mangle hook: NF_IP_POST_ROUTING
               target:  LOG level warning prefix "hello" index 100
       # tc action show action xt
       total acts 1
      
               action order 0: tablename: mangle  hook: NF_IP_POST_ROUTING
               target  LOG level warning prefix "hello"
               index 100 ref 1 bind 0
       # tc action replace action xt -j LOG --log-prefix world index 100
       tablename: mangle hook: NF_IP_POST_ROUTING
               target:  LOG level warning prefix "world" index 100
       RTNETLINK answers: Cannot allocate memory
       We have an error talking to the kernel
       # tc action show action xt
       total acts 1
      
               action order 0: tablename: mangle  hook: NF_IP_POST_ROUTING
               target  LOG level warning prefix "hello"
               index 100 ref 2 bind 0
      
      Ensure we call tcf_idr_release(), in case 'tname' allocation failed, also
      when the action is being replaced.
      
      Fixes: 4e8ddd7f ("net: sched: don't release reference on action overwrite")
      Signed-off-by: default avatarDavide Caratti <dcaratti@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f1446b16
    • Heiner Kallweit's avatar
      net: dsa: mv88e6xxx: prevent interrupt storm caused by mv88e6390x_port_set_cmode · 4d8f5df0
      Heiner Kallweit authored
      [ Upstream commit ed8fe202 ]
      
      When debugging another issue I faced an interrupt storm in this
      driver (88E6390, port 9 in SGMII mode), consisting of alternating
      link-up / link-down interrupts. Analysis showed that the driver
      wanted to set a cmode that was set already. But so far
      mv88e6390x_port_set_cmode() doesn't check this and powers down
      SERDES, what causes the link to break, and eventually results in
      the described interrupt storm.
      
      Fix this by checking whether the cmode actually changes. We want
      that the very first call to mv88e6390x_port_set_cmode() always
      configures the registers, therefore initialize port.cmode with
      a value that is different from any supported cmode value.
      We have to take care that we only init the ports cmode once
      chip->info->num_ports is set.
      
      v2:
      - add small helper and init the number of actual ports only
      
      Fixes: 364e9d77 ("net: dsa: mv88e6xxx: Power on/off SERDES on cmode change")
      Signed-off-by: default avatarHeiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4d8f5df0
    • Maxime Chevallier's avatar
      net: dsa: mv88e6xxx: power serdes on/off for 10G interfaces on 6390X · 457c1190
      Maxime Chevallier authored
      [ Upstream commit d235c48b ]
      
      Upon setting the cmode on 6390 and 6390X, the associated serdes
      interfaces must be powered off/on.
      
      Both 6390X and 6390 share code to do so, but it currently uses the 6390
      specific helper mv88e6390_serdes_power() to disable and enable the
      serdes interface.
      
      This call will fail silently on 6390X when trying so set a 10G interface
      such as XAUI or RXAUI, since mv88e6390_serdes_power() internally grabs
      the lane number based on modes supported by the 6390, and returns 0 when
      getting -ENODEV as a lane number.
      
      Using mv88e6390x_serdes_power() should be safe here, since we explicitly
      rule-out all ports but the 9 and 10, and because modes supported by 6390
      ports 9 and 10 are a subset of those supported on 6390X.
      
      This was tested on 6390X using RXAUI mode.
      
      Fixes: 364e9d77 ("net: dsa: mv88e6xxx: Power on/off SERDES on cmode change")
      Signed-off-by: default avatarMaxime Chevallier <maxime.chevallier@bootlin.com>
      Reviewed-by: default avatarAndrew Lunn <andrew@lunn.ch>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      457c1190
    • David Ahern's avatar
      ipv4: Pass original device to ip_rcv_finish_core · cc211561
      David Ahern authored
      [ Upstream commit a1fd1ad2 ]
      
      ip_route_input_rcu expects the original ingress device (e.g., for
      proper multicast handling). The skb->dev can be changed by l3mdev_ip_rcv,
      so dev needs to be saved prior to calling it. This was the behavior prior
      to the listify changes.
      
      Fixes: 5fa12739 ("net: ipv4: listify ip_rcv_finish")
      Cc: Edward Cree <ecree@solarflare.com>
      Signed-off-by: default avatarDavid Ahern <dsahern@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      cc211561
    • David Ahern's avatar
      mpls: Return error for RTA_GATEWAY attribute · 4f3221de
      David Ahern authored
      [ Upstream commit be48220e ]
      
      MPLS does not support nexthops with an MPLS address family.
      Specifically, it does not handle RTA_GATEWAY attribute. Make it
      clear by returning an error.
      
      Fixes: 03c05665 ("mpls: Netlink commands to add, remove, and dump routes")
      Signed-off-by: default avatarDavid Ahern <dsahern@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4f3221de
    • David Ahern's avatar
      ipv6: Return error for RTA_VIA attribute · a68d31cc
      David Ahern authored
      [ Upstream commit e3818541 ]
      
      IPv6 currently does not support nexthops outside of the AF_INET6 family.
      Specifically, it does not handle RTA_VIA attribute. If it is passed
      in a route add request, the actual route added only uses the device
      which is clearly not what the user intended:
      
        $ ip -6 ro add 2001:db8:2::/64 via inet 172.16.1.1 dev eth0
        $ ip ro ls
        ...
        2001:db8:2::/64 dev eth0 metric 1024 pref medium
      
      Catch this and fail the route add:
        $ ip -6 ro add 2001:db8:2::/64 via inet 172.16.1.1 dev eth0
        Error: IPv6 does not support RTA_VIA attribute.
      
      Fixes: 03c05665 ("mpls: Netlink commands to add, remove, and dump routes")
      Signed-off-by: default avatarDavid Ahern <dsahern@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a68d31cc
    • David Ahern's avatar
      ipv4: Return error for RTA_VIA attribute · 8c0aa3f6
      David Ahern authored
      [ Upstream commit b6e9e5df ]
      
      IPv4 currently does not support nexthops outside of the AF_INET family.
      Specifically, it does not handle RTA_VIA attribute. If it is passed
      in a route add request, the actual route added only uses the device
      which is clearly not what the user intended:
      
        $ ip ro add 172.16.1.0/24 via inet6 2001:db8:1::1 dev eth0
        $ ip ro ls
        ...
        172.16.1.0/24 dev eth0
      
      Catch this and fail the route add:
        $ ip ro add 172.16.1.0/24 via inet6 2001:db8:1::1 dev eth0
        Error: IPv4 does not support RTA_VIA attribute.
      
      Fixes: 03c05665 ("mpls: Netlink commands to add, remove, and dump routes")
      Signed-off-by: default avatarDavid Ahern <dsahern@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8c0aa3f6
    • Nazarov Sergey's avatar
      net: avoid use IPCB in cipso_v4_error · 125bc1e6
      Nazarov Sergey authored
      [ Upstream commit 3da1ed7a ]
      
      Extract IP options in cipso_v4_error and use __icmp_send.
      Signed-off-by: default avatarSergey Nazarov <s-nazarov@yandex.ru>
      Acked-by: default avatarPaul Moore <paul@paul-moore.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      125bc1e6
    • Nazarov Sergey's avatar
      net: Add __icmp_send helper. · f2397468
      Nazarov Sergey authored
      [ Upstream commit 9ef6b42a ]
      
      Add __icmp_send function having ip_options struct parameter
      Signed-off-by: default avatarSergey Nazarov <s-nazarov@yandex.ru>
      Reviewed-by: default avatarPaul Moore <paul@paul-moore.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f2397468
    • Timur Celik's avatar
      tun: remove unnecessary memory barrier · e6620def
      Timur Celik authored
      [ Upstream commit ecef67cb ]
      
      Replace set_current_state with __set_current_state since no memory
      barrier is needed at this point.
      Signed-off-by: default avatarTimur Celik <mail@timurcelik.de>
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e6620def
    • Igor Druzhinin's avatar
      xen-netback: fix occasional leak of grant ref mappings under memory pressure · 947fc52b
      Igor Druzhinin authored
      [ Upstream commit 99e87f56 ]
      
      Zero-copy callback flag is not yet set on frag list skb at the moment
      xenvif_handle_frag_list() returns -ENOMEM. This eventually results in
      leaking grant ref mappings since xenvif_zerocopy_callback() is never
      called for these fragments. Those eventually build up and cause Xen
      to kill Dom0 as the slots get reused for new mappings:
      
      "d0v0 Attempt to implicitly unmap a granted PTE c010000329fce005"
      
      That behavior is observed under certain workloads where sudden spikes
      of page cache writes coexist with active atomic skb allocations from
      network traffic. Additionally, rework the logic to deal with frag_list
      deallocation in a single place.
      Signed-off-by: default avatarPaul Durrant <paul.durrant@citrix.com>
      Signed-off-by: default avatarIgor Druzhinin <igor.druzhinin@citrix.com>
      Acked-by: default avatarWei Liu <wei.liu2@citrix.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      947fc52b
    • Igor Druzhinin's avatar
      xen-netback: don't populate the hash cache on XenBus disconnect · e5e58401
      Igor Druzhinin authored
      [ Upstream commit a2288d4e ]
      
      Occasionally, during the disconnection procedure on XenBus which
      includes hash cache deinitialization there might be some packets
      still in-flight on other processors. Handling of these packets includes
      hashing and hash cache population that finally results in hash cache
      data structure corruption.
      
      In order to avoid this we prevent hashing of those packets if there
      are no queues initialized. In that case RCU protection of queues guards
      the hash cache as well.
      Signed-off-by: default avatarIgor Druzhinin <igor.druzhinin@citrix.com>
      Reviewed-by: default avatarPaul Durrant <paul.durrant@citrix.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e5e58401
    • Timur Celik's avatar
      tun: fix blocking read · 488b9407
      Timur Celik authored
      [ Upstream commit 71828b22 ]
      
      This patch moves setting of the current state into the loop. Otherwise
      the task may end up in a busy wait loop if none of the break conditions
      are met.
      Signed-off-by: default avatarTimur Celik <mail@timurcelik.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      488b9407
    • Tung Nguyen's avatar
      tipc: fix race condition causing hung sendto · ab04570d
      Tung Nguyen authored
      [ Upstream commit bfd07f3d ]
      
      When sending multicast messages via blocking socket,
      if sending link is congested (tsk->cong_link_cnt is set to 1),
      the sending thread will be put into sleeping state. However,
      tipc_sk_filter_rcv() is called under socket spin lock but
      tipc_wait_for_cond() is not. So, there is no guarantee that
      the setting of tsk->cong_link_cnt to 0 in tipc_sk_proto_rcv() in
      CPU-1 will be perceived by CPU-0. If that is the case, the sending
      thread in CPU-0 after being waken up, will continue to see
      tsk->cong_link_cnt as 1 and put the sending thread into sleeping
      state again. The sending thread will sleep forever.
      
      CPU-0                                | CPU-1
      tipc_wait_for_cond()                 |
      {                                    |
       // condition_ = !tsk->cong_link_cnt |
       while ((rc_ = !(condition_))) {     |
        ...                                |
        release_sock(sk_);                 |
        wait_woken();                      |
                                           | if (!sock_owned_by_user(sk))
                                           |  tipc_sk_filter_rcv()
                                           |  {
                                           |   ...
                                           |   tipc_sk_proto_rcv()
                                           |   {
                                           |    ...
                                           |    tsk->cong_link_cnt--;
                                           |    ...
                                           |    sk->sk_write_space(sk);
                                           |    ...
                                           |   }
                                           |   ...
                                           |  }
        sched_annotate_sleep();            |
        lock_sock(sk_);                    |
        remove_wait_queue();               |
       }                                   |
      }                                    |
      
      This commit fixes it by adding memory barrier to tipc_sk_proto_rcv()
      and tipc_wait_for_cond().
      Acked-by: default avatarJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: default avatarTung Nguyen <tung.q.nguyen@dektech.com.au>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ab04570d
    • Eric Biggers's avatar
      net: socket: set sock->sk to NULL after calling proto_ops::release() · 5fdb551f
      Eric Biggers authored
      [ Upstream commit ff7b11aa ]
      
      Commit 9060cb71 ("net: crypto set sk to NULL when af_alg_release.")
      fixed a use-after-free in sockfs_setattr() when an AF_ALG socket is
      closed concurrently with fchownat().  However, it ignored that many
      other proto_ops::release() methods don't set sock->sk to NULL and
      therefore allow the same use-after-free:
      
          - base_sock_release
          - bnep_sock_release
          - cmtp_sock_release
          - data_sock_release
          - dn_release
          - hci_sock_release
          - hidp_sock_release
          - iucv_sock_release
          - l2cap_sock_release
          - llcp_sock_release
          - llc_ui_release
          - rawsock_release
          - rfcomm_sock_release
          - sco_sock_release
          - svc_release
          - vcc_release
          - x25_release
      
      Rather than fixing all these and relying on every socket type to get
      this right forever, just make __sock_release() set sock->sk to NULL
      itself after calling proto_ops::release().
      
      Reproducer that produces the KASAN splat when any of these socket types
      are configured into the kernel:
      
          #include <pthread.h>
          #include <stdlib.h>
          #include <sys/socket.h>
          #include <unistd.h>
      
          pthread_t t;
          volatile int fd;
      
          void *close_thread(void *arg)
          {
              for (;;) {
                  usleep(rand() % 100);
                  close(fd);
              }
          }
      
          int main()
          {
              pthread_create(&t, NULL, close_thread, NULL);
              for (;;) {
                  fd = socket(rand() % 50, rand() % 11, 0);
                  fchownat(fd, "", 1000, 1000, 0x1000);
                  close(fd);
              }
          }
      
      Fixes: 86741ec2 ("net: core: Add a UID field to struct sock.")
      Signed-off-by: default avatarEric Biggers <ebiggers@google.com>
      Acked-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5fdb551f
    • Mao Wenan's avatar
      net: sit: fix memory leak in sit_init_net() · d0bedaac
      Mao Wenan authored
      [ Upstream commit 07f12b26 ]
      
      If register_netdev() is failed to register sitn->fb_tunnel_dev,
      it will go to err_reg_dev and forget to free netdev(sitn->fb_tunnel_dev).
      
      BUG: memory leak
      unreferenced object 0xffff888378daad00 (size 512):
        comm "syz-executor.1", pid 4006, jiffies 4295121142 (age 16.115s)
        hex dump (first 32 bytes):
          00 e6 ed c0 83 88 ff ff 00 00 00 00 00 00 00 00  ................
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      backtrace:
          [<00000000d6dcb63e>] kvmalloc include/linux/mm.h:577 [inline]
          [<00000000d6dcb63e>] kvzalloc include/linux/mm.h:585 [inline]
          [<00000000d6dcb63e>] netif_alloc_netdev_queues net/core/dev.c:8380 [inline]
          [<00000000d6dcb63e>] alloc_netdev_mqs+0x600/0xcc0 net/core/dev.c:8970
          [<00000000867e172f>] sit_init_net+0x295/0xa40 net/ipv6/sit.c:1848
          [<00000000871019fa>] ops_init+0xad/0x3e0 net/core/net_namespace.c:129
          [<00000000319507f6>] setup_net+0x2ba/0x690 net/core/net_namespace.c:314
          [<0000000087db4f96>] copy_net_ns+0x1dc/0x330 net/core/net_namespace.c:437
          [<0000000057efc651>] create_new_namespaces+0x382/0x730 kernel/nsproxy.c:107
          [<00000000676f83de>] copy_namespaces+0x2ed/0x3d0 kernel/nsproxy.c:165
          [<0000000030b74bac>] copy_process.part.27+0x231e/0x6db0 kernel/fork.c:1919
          [<00000000fff78746>] copy_process kernel/fork.c:1713 [inline]
          [<00000000fff78746>] _do_fork+0x1bc/0xe90 kernel/fork.c:2224
          [<000000001c2e0d1c>] do_syscall_64+0xc8/0x580 arch/x86/entry/common.c:290
          [<00000000ec48bd44>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
          [<0000000039acff8a>] 0xffffffffffffffff
      Signed-off-by: default avatarMao Wenan <maowenan@huawei.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d0bedaac
    • Heiner Kallweit's avatar
      net: phy: phylink: fix uninitialized variable in phylink_get_mac_state · ed7a5441
      Heiner Kallweit authored
      [ Upstream commit d25ed413 ]
      
      When debugging an issue I found implausible values in state->pause.
      Reason in that state->pause isn't initialized and later only single
      bits are changed. Also the struct itself isn't initialized in
      phylink_resolve(). So better initialize state->pause and other
      not yet initialized fields.
      
      v2:
      - use right function name in subject
      v3:
      - initialize additional fields
      
      Fixes: 9525ae83 ("phylink: add phylink infrastructure")
      Signed-off-by: default avatarHeiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ed7a5441
    • Rajasingh Thavamani's avatar
      net: phy: Micrel KSZ8061: link failure after cable connect · d0681689
      Rajasingh Thavamani authored
      [ Upstream commit 232ba3a5 ]
      
      With Micrel KSZ8061 PHY, the link may occasionally not come up after
      Ethernet cable connect. The vendor's (Microchip, former Micrel) errata
      sheet 80000688A.pdf descripes the problem and possible workarounds in
      detail, see below.
      The batch implements workaround 1, which permanently fixes the issue.
      
      DESCRIPTION
      Link-up may not occur properly when the Ethernet cable is initially
      connected. This issue occurs more commonly when the cable is connected
      slowly, but it may occur any time a cable is connected. This issue occurs
      in the auto-negotiation circuit, and will not occur if auto-negotiation
      is disabled (which requires that the two link partners be set to the
      same speed and duplex).
      
      END USER IMPLICATIONS
      When this issue occurs, link is not established. Subsequent cable
      plug/unplaug cycle will not correct the issue.
      
      WORk AROUND
      There are four approaches to work around this issue:
      1. This issue can be prevented by setting bit 15 in MMD device address 1,
         register 2, prior to connecting the cable or prior to setting the
         Restart Auto-negotiation bit in register 0h. The MMD registers are
         accessed via the indirect access registers Dh and Eh, or via the Micrel
         EthUtil utility as shown here:
         . if using the EthUtil utility (usually with a Micrel KSZ8061
           Evaluation Board), type the following commands:
           > address 1
           > mmd 1
           > iw 2 b61a
         . Alternatively, write the following registers to write to the
           indirect MMD register:
           Write register Dh, data 0001h
           Write register Eh, data 0002h
           Write register Dh, data 4001h
           Write register Eh, data B61Ah
      2. The issue can be avoided by disabling auto-negotiation in the KSZ8061,
         either by the strapping option, or by clearing bit 12 in register 0h.
         Care must be taken to ensure that the KSZ8061 and the link partner
         will link with the same speed and duplex. Note that the KSZ8061
         defaults to full-duplex when auto-negotiation is off, but other
         devices may default to half-duplex in the event of failed
         auto-negotiation.
      3. The issue can be avoided by connecting the cable prior to powering-up
         or resetting the KSZ8061, and leaving it plugged in thereafter.
      4. If the above measures are not taken and the problem occurs, link can
         be recovered by setting the Restart Auto-Negotiation bit in
         register 0h, or by resetting or power cycling the device. Reset may
         be either hardware reset or software reset (register 0h, bit 15).
      
      PLAN
      This errata will not be corrected in the future revision.
      
      Fixes: 7ab59dc1 ("drivers/net/phy/micrel_phy: Add support for new PHYs")
      Signed-off-by: default avatarAlexander Onnasch <alexander.onnasch@landisgyr.com>
      Signed-off-by: default avatarRajasingh Thavamani <T.Rajasingh@landisgyr.com>
      Reviewed-by: default avatarAndrew Lunn <andrew@lunn.ch>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d0681689
    • YueHaibing's avatar
      net: nfc: Fix NULL dereference on nfc_llcp_build_tlv fails · f132b3f5
      YueHaibing authored
      [ Upstream commit 58bdd544 ]
      
      KASAN report this:
      
      BUG: KASAN: null-ptr-deref in nfc_llcp_build_gb+0x37f/0x540 [nfc]
      Read of size 3 at addr 0000000000000000 by task syz-executor.0/5401
      
      CPU: 0 PID: 5401 Comm: syz-executor.0 Not tainted 5.0.0-rc7+ #45
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0xfa/0x1ce lib/dump_stack.c:113
       kasan_report+0x171/0x18d mm/kasan/report.c:321
       memcpy+0x1f/0x50 mm/kasan/common.c:130
       nfc_llcp_build_gb+0x37f/0x540 [nfc]
       nfc_llcp_register_device+0x6eb/0xb50 [nfc]
       nfc_register_device+0x50/0x1d0 [nfc]
       nfcsim_device_new+0x394/0x67d [nfcsim]
       ? 0xffffffffc1080000
       nfcsim_init+0x6b/0x1000 [nfcsim]
       do_one_initcall+0xfa/0x5ca init/main.c:887
       do_init_module+0x204/0x5f6 kernel/module.c:3460
       load_module+0x66b2/0x8570 kernel/module.c:3808
       __do_sys_finit_module+0x238/0x2a0 kernel/module.c:3902
       do_syscall_64+0x147/0x600 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x462e99
      Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
      RSP: 002b:00007f9cb79dcc58 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
      RAX: ffffffffffffffda RBX: 000000000073bf00 RCX: 0000000000462e99
      RDX: 0000000000000000 RSI: 0000000020000280 RDI: 0000000000000003
      RBP: 00007f9cb79dcc70 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00007f9cb79dd6bc
      R13: 00000000004bcefb R14: 00000000006f7030 R15: 0000000000000004
      
      nfc_llcp_build_tlv will return NULL on fails, caller should check it,
      otherwise will trigger a NULL dereference.
      Reported-by: default avatarHulk Robot <hulkci@huawei.com>
      Fixes: eda21f16 ("NFC: Set MIU and RW values from CONNECT and CC LLCP frames")
      Fixes: d646960f ("NFC: Initial LLCP support")
      Signed-off-by: default avatarYueHaibing <yuehaibing@huawei.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f132b3f5
    • Sheng Lan's avatar
      net: netem: fix skb length BUG_ON in __skb_to_sgvec · d1dd2e15
      Sheng Lan authored
      [ Upstream commit 5845f706 ]
      
      It can be reproduced by following steps:
      1. virtio_net NIC is configured with gso/tso on
      2. configure nginx as http server with an index file bigger than 1M bytes
      3. use tc netem to produce duplicate packets and delay:
         tc qdisc add dev eth0 root netem delay 100ms 10ms 30% duplicate 90%
      4. continually curl the nginx http server to get index file on client
      5. BUG_ON is seen quickly
      
      [10258690.371129] kernel BUG at net/core/skbuff.c:4028!
      [10258690.371748] invalid opcode: 0000 [#1] SMP PTI
      [10258690.372094] CPU: 5 PID: 0 Comm: swapper/5 Tainted: G        W         5.0.0-rc6 #2
      [10258690.372094] RSP: 0018:ffffa05797b43da0 EFLAGS: 00010202
      [10258690.372094] RBP: 00000000000005ea R08: 0000000000000000 R09: 00000000000005ea
      [10258690.372094] R10: ffffa0579334d800 R11: 00000000000002c0 R12: 0000000000000002
      [10258690.372094] R13: 0000000000000000 R14: ffffa05793122900 R15: ffffa0578f7cb028
      [10258690.372094] FS:  0000000000000000(0000) GS:ffffa05797b40000(0000) knlGS:0000000000000000
      [10258690.372094] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [10258690.372094] CR2: 00007f1a6dc00868 CR3: 000000001000e000 CR4: 00000000000006e0
      [10258690.372094] Call Trace:
      [10258690.372094]  <IRQ>
      [10258690.372094]  skb_to_sgvec+0x11/0x40
      [10258690.372094]  start_xmit+0x38c/0x520 [virtio_net]
      [10258690.372094]  dev_hard_start_xmit+0x9b/0x200
      [10258690.372094]  sch_direct_xmit+0xff/0x260
      [10258690.372094]  __qdisc_run+0x15e/0x4e0
      [10258690.372094]  net_tx_action+0x137/0x210
      [10258690.372094]  __do_softirq+0xd6/0x2a9
      [10258690.372094]  irq_exit+0xde/0xf0
      [10258690.372094]  smp_apic_timer_interrupt+0x74/0x140
      [10258690.372094]  apic_timer_interrupt+0xf/0x20
      [10258690.372094]  </IRQ>
      
      In __skb_to_sgvec(), the skb->len is not equal to the sum of the skb's
      linear data size and nonlinear data size, thus BUG_ON triggered.
      Because the skb is cloned and a part of nonlinear data is split off.
      
      Duplicate packet is cloned in netem_enqueue() and may be delayed
      some time in qdisc. When qdisc len reached the limit and returns
      NET_XMIT_DROP, the skb will be retransmit later in write queue.
      the skb will be fragmented by tso_fragment(), the limit size
      that depends on cwnd and mss decrease, the skb's nonlinear
      data will be split off. The length of the skb cloned by netem
      will not be updated. When we use virtio_net NIC and invoke skb_to_sgvec(),
      the BUG_ON trigger.
      
      To fix it, netem returns NET_XMIT_SUCCESS to upper stack
      when it clones a duplicate packet.
      
      Fixes: 35d889d1 ("sch_netem: fix skb leak in netem_enqueue()")
      Signed-off-by: default avatarSheng Lan <lansheng@huawei.com>
      Reported-by: default avatarQin Ji <jiqin.ji@huawei.com>
      Suggested-by: default avatarEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d1dd2e15
    • Paul Moore's avatar
      netlabel: fix out-of-bounds memory accesses · e3713abc
      Paul Moore authored
      [ Upstream commit 5578de48 ]
      
      There are two array out-of-bounds memory accesses, one in
      cipso_v4_map_lvl_valid(), the other in netlbl_bitmap_walk().  Both
      errors are embarassingly simple, and the fixes are straightforward.
      
      As a FYI for anyone backporting this patch to kernels prior to v4.8,
      you'll want to apply the netlbl_bitmap_walk() patch to
      cipso_v4_bitmap_walk() as netlbl_bitmap_walk() doesn't exist before
      Linux v4.8.
      Reported-by: default avatarJann Horn <jannh@google.com>
      Fixes: 446fda4f ("[NetLabel]: CIPSOv4 engine")
      Fixes: 3faa8f98 ("netlabel: Move bitmap manipulation functions to the NetLabel core.")
      Signed-off-by: default avatarPaul Moore <paul@paul-moore.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e3713abc
    • Andrew Lunn's avatar
      net: dsa: mv88e6xxx: Fix u64 statistics · 4afc9831
      Andrew Lunn authored
      [ Upstream commit 6e46e2d8 ]
      
      The switch maintains u64 counters for the number of octets sent and
      received. These are kept as two u32's which need to be combined.  Fix
      the combing, which wrongly worked on u16's.
      
      Fixes: 80c4627b ("dsa: mv88x6xxx: Refactor getting a single statistic")
      Reported-by: default avatarChris Healy <Chris.Healy@zii.aero>
      Signed-off-by: default avatarAndrew Lunn <andrew@lunn.ch>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4afc9831
    • Andrew Lunn's avatar
      net: dsa: mv88e6xxx: Fix statistics on mv88e6161 · 05d9f554
      Andrew Lunn authored
      [ Upstream commit a6da21bb ]
      
      Despite what the datesheet says, the silicon implements the older way
      of snapshoting the statistics. Change the op.
      
      Reported-by: Chris.Healy@zii.aero
      Tested-by: Chris.Healy@zii.aero
      Fixes: 0ac64c39 ("net: dsa: mv88e6xxx: mv88e6161 uses mv88e6320 stats snapshot")
      Signed-off-by: default avatarAndrew Lunn <andrew@lunn.ch>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      05d9f554
    • Bryan Whitehead's avatar
      lan743x: Fix TX Stall Issue · ceb7c249
      Bryan Whitehead authored
      [ Upstream commit 90490ef7 ]
      
      It has been observed that tx queue stalls while downloading
      from certain web sites (example www.speedtest.net)
      
      The cause has been tracked down to a corner case where
      dma descriptors where not setup properly. And there for a tx
      completion interrupt was not signaled.
      
      This fix corrects the problem by properly marking the end of
      a multi descriptor transmission.
      
      Fixes: 23f0703c ("lan743x: Add main source files for new lan743x driver")
      Signed-off-by: default avatarBryan Whitehead <Bryan.Whitehead@microchip.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ceb7c249
    • Hangbin Liu's avatar
      ipv4: Add ICMPv6 support when parse route ipproto · 99ed9458
      Hangbin Liu authored
      [ Upstream commit 5e1a99ea ]
      
      For ip rules, we need to use 'ipproto ipv6-icmp' to match ICMPv6 headers.
      But for ip -6 route, currently we only support tcp, udp and icmp.
      
      Add ICMPv6 support so we can match ipv6-icmp rules for route lookup.
      
      v2: As David Ahern and Sabrina Dubroca suggested, Add an argument to
      rtm_getroute_parse_ip_proto() to handle ICMP/ICMPv6 with different family.
      Reported-by: default avatarJianlin Shi <jishi@redhat.com>
      Fixes: eacb9384 ("ipv6: support sport, dport and ip_proto in RTM_GETROUTE")
      Signed-off-by: default avatarHangbin Liu <liuhangbin@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      99ed9458
    • Haiyang Zhang's avatar
      hv_netvsc: Fix IP header checksum for coalesced packets · d61918a5
      Haiyang Zhang authored
      [ Upstream commit bf48648d ]
      
      Incoming packets may have IP header checksum verified by the host.
      They may not have IP header checksum computed after coalescing.
      This patch re-compute the checksum when necessary, otherwise the
      packets may be dropped, because Linux network stack always checks it.
      Signed-off-by: default avatarHaiyang Zhang <haiyangz@microsoft.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d61918a5
    • Jiri Benc's avatar
      geneve: correctly handle ipv6.disable module parameter · 36bd44bc
      Jiri Benc authored
      [ Upstream commit cf1c9ccb ]
      
      When IPv6 is compiled but disabled at runtime, geneve_sock_add returns
      -EAFNOSUPPORT. For metadata based tunnels, this causes failure of the whole
      operation of bringing up the tunnel.
      
      Ignore failure of IPv6 socket creation for metadata based tunnels caused by
      IPv6 not being available.
      
      This is the same fix as what commit d074bf96 ("vxlan: correctly handle
      ipv6.disable module parameter") is doing for vxlan.
      
      Note there's also commit c0a47e44 ("geneve: should not call rt6_lookup()
      when ipv6 was disabled") which fixes a similar issue but for regular
      tunnels, while this patch is needed for metadata based tunnels.
      Signed-off-by: default avatarJiri Benc <jbenc@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      36bd44bc
    • Michael Chan's avatar
      bnxt_en: Drop oversize TX packets to prevent errors. · 1713c8e1
      Michael Chan authored
      [ Upstream commit 2b3c6885 ]
      
      There have been reports of oversize UDP packets being sent to the
      driver to be transmitted, causing error conditions.  The issue is
      likely caused by the dst of the SKB switching between 'lo' with
      64K MTU and the hardware device with a smaller MTU.  Patches are
      being proposed by Mahesh Bandewar <maheshb@google.com> to fix the
      issue.
      
      In the meantime, add a quick length check in the driver to prevent
      the error.  The driver uses the TX packet size as index to look up an
      array to setup the TX BD.  The array is large enough to support all MTU
      sizes supported by the driver.  The oversize TX packet causes the
      driver to index beyond the array and put garbage values into the
      TX BD.  Add a simple check to prevent this.
      Signed-off-by: default avatarMichael Chan <michael.chan@broadcom.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1713c8e1
    • Erik Hugne's avatar
      tipc: fix RDM/DGRAM connect() regression · 8d1b9800
      Erik Hugne authored
      [ Upstream commit 0e632089 ]
      
      Fix regression bug introduced in
      commit 365ad353 ("tipc: reduce risk of user starvation during link
      congestion")
      
      Only signal -EDESTADDRREQ for RDM/DGRAM if we don't have a cached
      sockaddr.
      
      Fixes: 365ad353 ("tipc: reduce risk of user starvation during link congestion")
      Signed-off-by: default avatarErik Hugne <erik.hugne@gmail.com>
      Signed-off-by: default avatarJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8d1b9800
    • Ido Schimmel's avatar
      team: Free BPF filter when unregistering netdev · 089100d5
      Ido Schimmel authored
      [ Upstream commit 692c31bd ]
      
      When team is used in loadbalance mode a BPF filter can be used to
      provide a hash which will determine the Tx port.
      
      When the netdev is later unregistered the filter is not freed which
      results in memory leaks [1].
      
      Fix by freeing the program and the corresponding filter when
      unregistering the netdev.
      
      [1]
      unreferenced object 0xffff8881dbc47cc8 (size 16):
        comm "teamd", pid 3068, jiffies 4294997779 (age 438.247s)
        hex dump (first 16 bytes):
          a3 00 6b 6b 6b 6b 6b 6b 88 a5 82 e1 81 88 ff ff  ..kkkkkk........
        backtrace:
          [<000000008a3b47e3>] team_nl_cmd_options_set+0x88f/0x11b0
          [<00000000c4f4f27e>] genl_family_rcv_msg+0x78f/0x1080
          [<00000000610ef838>] genl_rcv_msg+0xca/0x170
          [<00000000a281df93>] netlink_rcv_skb+0x132/0x380
          [<000000004d9448a2>] genl_rcv+0x29/0x40
          [<000000000321b2f4>] netlink_unicast+0x4c0/0x690
          [<000000008c25dffb>] netlink_sendmsg+0x929/0xe10
          [<00000000068298c5>] sock_sendmsg+0xc8/0x110
          [<0000000082a61ff0>] ___sys_sendmsg+0x77a/0x8f0
          [<00000000663ae29d>] __sys_sendmsg+0xf7/0x250
          [<0000000027c5f11a>] do_syscall_64+0x14d/0x610
          [<000000006cfbc8d3>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
          [<00000000e23197e2>] 0xffffffffffffffff
      unreferenced object 0xffff8881e182a588 (size 2048):
        comm "teamd", pid 3068, jiffies 4294997780 (age 438.247s)
        hex dump (first 32 bytes):
          20 00 00 00 02 00 00 00 30 00 00 00 28 f0 ff ff   .......0...(...
          07 00 00 00 00 00 00 00 28 00 00 00 00 00 00 00  ........(.......
        backtrace:
          [<000000002daf01fb>] lb_bpf_func_set+0x45c/0x6d0
          [<000000008a3b47e3>] team_nl_cmd_options_set+0x88f/0x11b0
          [<00000000c4f4f27e>] genl_family_rcv_msg+0x78f/0x1080
          [<00000000610ef838>] genl_rcv_msg+0xca/0x170
          [<00000000a281df93>] netlink_rcv_skb+0x132/0x380
          [<000000004d9448a2>] genl_rcv+0x29/0x40
          [<000000000321b2f4>] netlink_unicast+0x4c0/0x690
          [<000000008c25dffb>] netlink_sendmsg+0x929/0xe10
          [<00000000068298c5>] sock_sendmsg+0xc8/0x110
          [<0000000082a61ff0>] ___sys_sendmsg+0x77a/0x8f0
          [<00000000663ae29d>] __sys_sendmsg+0xf7/0x250
          [<0000000027c5f11a>] do_syscall_64+0x14d/0x610
          [<000000006cfbc8d3>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
          [<00000000e23197e2>] 0xffffffffffffffff
      
      Fixes: 01d7f30a ("team: add loadbalance mode")
      Signed-off-by: default avatarIdo Schimmel <idosch@mellanox.com>
      Reported-by: default avatarAmit Cohen <amitc@mellanox.com>
      Acked-by: default avatarJiri Pirko <jiri@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      089100d5
    • Kai-Heng Feng's avatar
      sky2: Disable MSI on Dell Inspiron 1545 and Gateway P-79 · 5e311e53
      Kai-Heng Feng authored
      [ Upstream commit b33b7cd6 ]
      
      Some sky2 chips fire IRQ after S3, before the driver is fully resumed:
      [ 686.804877] do_IRQ: 1.37 No irq handler for vector
      
      This is likely a platform bug that device isn't fully quiesced during
      S3. Use MSI-X, maskable MSI or INTx can prevent this issue from
      happening.
      
      Since MSI-X and maskable MSI are not supported by this device, fallback
      to use INTx on affected platforms.
      
      BugLink: https://bugs.launchpad.net/bugs/1807259
      BugLink: https://bugs.launchpad.net/bugs/1809843Signed-off-by: default avatarKai-Heng Feng <kai.heng.feng@canonical.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5e311e53
    • Xin Long's avatar
      sctp: call iov_iter_revert() after sending ABORT · 8085d6d0
      Xin Long authored
      [ Upstream commit 901efe12 ]
      
      The user msg is also copied to the abort packet when doing SCTP_ABORT in
      sctp_sendmsg_check_sflags(). When SCTP_SENDALL is set, iov_iter_revert()
      should have been called for sending abort on the next asoc with copying
      this msg. Otherwise, memcpy_from_msg() in sctp_make_abort_user() will
      fail and return error.
      
      Fixes: 49102805 ("sctp: add support for snd flag SCTP_SENDALL process in sendmsg")
      Reported-by: default avatarYing Xu <yinxu@redhat.com>
      Signed-off-by: default avatarXin Long <lucien.xin@gmail.com>
      Acked-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8085d6d0
    • Kristian Evensen's avatar
      qmi_wwan: Add support for Quectel EG12/EM12 · 16a006d7
      Kristian Evensen authored
      [ Upstream commit 822e44b4 ]
      
      Quectel EG12 (module)/EM12 (M.2 card) is a Cat. 12 LTE modem. The modem
      behaves in the same way as the EP06, so the "set DTR"-quirk must be
      applied and the diagnostic-interface check performed. Since the
      diagnostic-check now applies to more modems, I have renamed the function
      from quectel_ep06_diag_detected() to quectel_diag_detected().
      Signed-off-by: default avatarKristian Evensen <kristian.evensen@gmail.com>
      Acked-by: default avatarBjørn Mork <bjorn@mork.no>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      16a006d7
    • YueHaibing's avatar
      net-sysfs: Fix mem leak in netdev_register_kobject · 7ce2a517
      YueHaibing authored
      [ Upstream commit 895a5e96 ]
      
      syzkaller report this:
      BUG: memory leak
      unreferenced object 0xffff88837a71a500 (size 256):
        comm "syz-executor.2", pid 9770, jiffies 4297825125 (age 17.843s)
        hex dump (first 32 bytes):
          00 00 00 00 ad 4e ad de ff ff ff ff 00 00 00 00  .....N..........
          ff ff ff ff ff ff ff ff 20 c0 ef 86 ff ff ff ff  ........ .......
        backtrace:
          [<00000000db12624b>] netdev_register_kobject+0x124/0x2e0 net/core/net-sysfs.c:1751
          [<00000000dc49a994>] register_netdevice+0xcc1/0x1270 net/core/dev.c:8516
          [<00000000e5f3fea0>] tun_set_iff drivers/net/tun.c:2649 [inline]
          [<00000000e5f3fea0>] __tun_chr_ioctl+0x2218/0x3d20 drivers/net/tun.c:2883
          [<000000001b8ac127>] vfs_ioctl fs/ioctl.c:46 [inline]
          [<000000001b8ac127>] do_vfs_ioctl+0x1a5/0x10e0 fs/ioctl.c:690
          [<0000000079b269f8>] ksys_ioctl+0x89/0xa0 fs/ioctl.c:705
          [<00000000de649beb>] __do_sys_ioctl fs/ioctl.c:712 [inline]
          [<00000000de649beb>] __se_sys_ioctl fs/ioctl.c:710 [inline]
          [<00000000de649beb>] __x64_sys_ioctl+0x74/0xb0 fs/ioctl.c:710
          [<000000007ebded1e>] do_syscall_64+0xc8/0x580 arch/x86/entry/common.c:290
          [<00000000db315d36>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
          [<00000000115be9bb>] 0xffffffffffffffff
      
      It should call kset_unregister to free 'dev->queues_kset'
      in error path of register_queue_kobjects, otherwise will cause a mem leak.
      Reported-by: default avatarHulk Robot <hulkci@huawei.com>
      Fixes: 1d24eb48 ("xps: Transmit Packet Steering")
      Signed-off-by: default avatarYueHaibing <yuehaibing@huawei.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7ce2a517
    • Eric Dumazet's avatar
      net: sched: put back q.qlen into a single location · 3043bfe0
      Eric Dumazet authored
      [ Upstream commit 46b1c18f ]
      
      In the series fc8b81a5 ("Merge branch 'lockless-qdisc-series'")
      John made the assumption that the data path had no need to read
      the qdisc qlen (number of packets in the qdisc).
      
      It is true when pfifo_fast is used as the root qdisc, or as direct MQ/MQPRIO
      children.
      
      But pfifo_fast can be used as leaf in class full qdiscs, and existing
      logic needs to access the child qlen in an efficient way.
      
      HTB breaks badly, since it uses cl->leaf.q->q.qlen in :
        htb_activate() -> WARN_ON()
        htb_dequeue_tree() to decide if a class can be htb_deactivated
        when it has no more packets.
      
      HFSC, DRR, CBQ, QFQ have similar issues, and some calls to
      qdisc_tree_reduce_backlog() also read q.qlen directly.
      
      Using qdisc_qlen_sum() (which iterates over all possible cpus)
      in the data path is a non starter.
      
      It seems we have to put back qlen in a central location,
      at least for stable kernels.
      
      For all qdisc but pfifo_fast, qlen is guarded by the qdisc lock,
      so the existing q.qlen{++|--} are correct.
      
      For 'lockless' qdisc (pfifo_fast so far), we need to use atomic_{inc|dec}()
      because the spinlock might be not held (for example from
      pfifo_fast_enqueue() and pfifo_fast_dequeue())
      
      This patch adds atomic_qlen (in the same location than qlen)
      and renames the following helpers, since we want to express
      they can be used without qdisc lock, and that qlen is no longer percpu.
      
      - qdisc_qstats_cpu_qlen_dec -> qdisc_qstats_atomic_qlen_dec()
      - qdisc_qstats_cpu_qlen_inc -> qdisc_qstats_atomic_qlen_inc()
      
      Later (net-next) we might revert this patch by tracking all these
      qlen uses and replace them by a more efficient method (not having
      to access a precise qlen, but an empty/non_empty status that might
      be less expensive to maintain/track).
      
      Another possibility is to have a legacy pfifo_fast version that would
      be used when used a a child qdisc, since the parent qdisc needs
      a spinlock anyway. But then, future lockless qdiscs would also
      have the same problem.
      
      Fixes: 7e66016f ("net: sched: helpers to sum qlen and qlen for per cpu logic")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: John Fastabend <john.fastabend@gmail.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3043bfe0
    • Heiner Kallweit's avatar
      net: dsa: mv8e6xxx: fix number of internal PHYs for 88E6x90 family · 0429c9ef
      Heiner Kallweit authored
      [ Upstream commit 95150f29 ]
      
      Ports 9 and 10 don't have internal PHY's but are (dependent on the
      version) SERDES/SGMII/XAUI/RXAUI ports.
      
      v2:
      - fix it for all 88E6x90 family members
      
      Fixes: bc393155 ("net: dsa: mv88e6xxx: Add number of internal PHYs")
      Signed-off-by: default avatarHeiner Kallweit <hkallweit1@gmail.com>
      Reviewed-by: default avatarAndrew Lunn <andrew@lunn.ch>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0429c9ef
    • Heiner Kallweit's avatar
      net: dsa: mv88e6xxx: handle unknown duplex modes gracefully in mv88e6xxx_port_set_duplex · dea81899
      Heiner Kallweit authored
      [ Upstream commit c6195a8b ]
      
      When testing another issue I faced the problem that
      mv88e6xxx_port_setup_mac() failed due to DUPLEX_UNKNOWN being passed
      as argument to mv88e6xxx_port_set_duplex(). We should handle this case
      gracefully and return -EOPNOTSUPP, like e.g. mv88e6xxx_port_set_speed()
      is doing it.
      
      Fixes: 7f1ae07b ("net: dsa: mv88e6xxx: add port duplex setter")
      Signed-off-by: default avatarHeiner Kallweit <hkallweit1@gmail.com>
      Reviewed-by: default avatarAndrew Lunn <andrew@lunn.ch>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      dea81899
    • Ido Schimmel's avatar
      ip6mr: Do not call __IP6_INC_STATS() from preemptible context · b5ff77dd
      Ido Schimmel authored
      [ Upstream commit 87c11f1d ]
      
      Similar to commit 44f49dd8 ("ipmr: fix possible race resulting from
      improper usage of IP_INC_STATS_BH() in preemptible context."), we cannot
      assume preemption is disabled when incrementing the counter and
      accessing a per-CPU variable.
      
      Preemption can be enabled when we add a route in process context that
      corresponds to packets stored in the unresolved queue, which are then
      forwarded using this route [1].
      
      Fix this by using IP6_INC_STATS() which takes care of disabling
      preemption on architectures where it is needed.
      
      [1]
      [  157.451447] BUG: using __this_cpu_add() in preemptible [00000000] code: smcrouted/2314
      [  157.460409] caller is ip6mr_forward2+0x73e/0x10e0
      [  157.460434] CPU: 3 PID: 2314 Comm: smcrouted Not tainted 5.0.0-rc7-custom-03635-g22f2712113f1 #1336
      [  157.460449] Hardware name: Mellanox Technologies Ltd. MSN2100-CB2FO/SA001017, BIOS 5.6.5 06/07/2016
      [  157.460461] Call Trace:
      [  157.460486]  dump_stack+0xf9/0x1be
      [  157.460553]  check_preemption_disabled+0x1d6/0x200
      [  157.460576]  ip6mr_forward2+0x73e/0x10e0
      [  157.460705]  ip6_mr_forward+0x9a0/0x1510
      [  157.460771]  ip6mr_mfc_add+0x16b3/0x1e00
      [  157.461155]  ip6_mroute_setsockopt+0x3cb/0x13c0
      [  157.461384]  do_ipv6_setsockopt.isra.8+0x348/0x4060
      [  157.462013]  ipv6_setsockopt+0x90/0x110
      [  157.462036]  rawv6_setsockopt+0x4a/0x120
      [  157.462058]  __sys_setsockopt+0x16b/0x340
      [  157.462198]  __x64_sys_setsockopt+0xbf/0x160
      [  157.462220]  do_syscall_64+0x14d/0x610
      [  157.462349]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Fixes: 0912ea38 ("[IPV6] MROUTE: Add stats in multicast routing module method ip6_mr_forward().")
      Signed-off-by: default avatarIdo Schimmel <idosch@mellanox.com>
      Reported-by: default avatarAmit Cohen <amitc@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b5ff77dd