1. 06 Jun, 2019 10 commits
    • net: rds: fix memory leak in rds_ib_flush_mr_pool · 85cb9287
      Zhu Yanjun authored
      The problem occurs after the following tests have run for several hours.
      
      Server:
          rds-stress -r 1.1.1.16 -D 1M
      Client:
          rds-stress -r 1.1.1.14 -s 1.1.1.16 -D 1M -T 30
      
      rds-stress then reports no traffic at all:
      
      "
      Starting up....
      tsks   tx/s   rx/s  tx+rx K/s    mbi K/s    mbo K/s tx us/c   rtt us cpu %
        1      0      0       0.00       0.00       0.00    0.00 0.00 -1.00
        1      0      0       0.00       0.00       0.00    0.00 0.00 -1.00
        1      0      0       0.00       0.00       0.00    0.00 0.00 -1.00
        1      0      0       0.00       0.00       0.00    0.00 0.00 -1.00
      "
      From vmcore, we can find that clean_list is NULL.

      From the source code, rds_mr_flushd calls rds_ib_mr_pool_flush_worker.
      Then rds_ib_mr_pool_flush_worker calls
      "
       rds_ib_flush_mr_pool(pool, 0, NULL);
      "
      Then in function
      "
      int rds_ib_flush_mr_pool(struct rds_ib_mr_pool *pool,
                               int free_all, struct rds_ib_mr **ibmr_ret)
      "
      ibmr_ret is NULL.
      
      In the source code,
      "
      ...
      list_to_llist_nodes(pool, &unmap_list, &clean_nodes, &clean_tail);
      if (ibmr_ret)
              *ibmr_ret = llist_entry(clean_nodes, struct rds_ib_mr, llnode);
      
      /* more than one entry in llist nodes */
      if (clean_nodes->next)
              llist_add_batch(clean_nodes->next, clean_tail, &pool->clean_list);
      ...
      "
      When ibmr_ret is NULL, llist_entry is not executed, so the head node
      clean_nodes is not handed to any caller. Yet only clean_nodes->next,
      not clean_nodes itself, is added to clean_list.
      So the head node is discarded and can never be used again.
      The workqueue is executed periodically, so more and more head nodes are
      discarded until clean_list is finally NULL.
      Then the stall shown above occurs.
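
The leak described above can be reproduced in a small userspace sketch. Everything here is hypothetical: struct node stands in for the kernel's llist node, add_batch() only imitates what llist_add_batch() does, and recycle_all() is one possible correction, not necessarily the code of the actual patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct llist_node. */
struct node {
    struct node *next;
};

/* Count the nodes reachable from head. */
static size_t chain_len(const struct node *head)
{
    size_t n = 0;

    for (; head; head = head->next)
        n++;
    return n;
}

/* Push the chain first..last onto *list, in the spirit of llist_add_batch(). */
static void add_batch(struct node *first, struct node *last, struct node **list)
{
    assert(first != NULL && last != NULL);
    last->next = *list;
    *list = first;
}

/* Behaviour described above: the head node is skipped unconditionally. */
static void recycle_leaky(struct node *nodes, struct node *tail,
                          struct node **clean_list, struct node **ret)
{
    if (ret)
        *ret = nodes;
    if (nodes->next)
        add_batch(nodes->next, tail, clean_list);
}

/* Possible correction: skip the head only when the caller receives it. */
static void recycle_all(struct node *nodes, struct node *tail,
                        struct node **clean_list, struct node **ret)
{
    if (ret) {
        *ret = nodes;
        nodes = nodes->next;
    }
    if (nodes)
        add_batch(nodes, tail, clean_list);
}

/* Recycle a three-node chain with ret == NULL; return the clean-list length. */
size_t recycled_count(int leaky)
{
    struct node n[3];
    struct node *clean = NULL;

    n[0].next = &n[1];
    n[1].next = &n[2];
    n[2].next = NULL;

    if (leaky)
        recycle_leaky(&n[0], &n[2], &clean, NULL);
    else
        recycle_all(&n[0], &n[2], &clean, NULL);
    return chain_len(clean);
}
```

With ret == NULL, the leaky variant puts only two of the three nodes back on the clean list, which matches the one-node-per-flush loss described in the message; the corrected variant returns all three.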
      
      Fixes: 1bc144b6 ("net, rds, Replace xlist in net/rds/xlist.h with llist")
      Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
      Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'ipv6-fix-EFAULT-on-sendto-with-icmpv6-and-hdrincl' · 8d037f92
      David S. Miller authored
      Olivier Matz says:
      
      ====================
      ipv6: fix EFAULT on sendto with icmpv6 and hdrincl
      
      The following code returns EFAULT (Bad address):
      
        s = socket(AF_INET6, SOCK_RAW, IPPROTO_ICMPV6);
        setsockopt(s, SOL_IPV6, IPV6_HDRINCL, 1);
        sendto(ipv6_icmp6_packet, addr);   /* returns -1, errno = EFAULT */
      
      The problem is fixed in the second patch. The first one aligns the
      code to ipv4, to avoid a race condition in the second patch.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: fix EFAULT on sendto with icmpv6 and hdrincl · b9aa52c4
      Olivier Matz authored
      The following code returns EFAULT (Bad address):
      
        s = socket(AF_INET6, SOCK_RAW, IPPROTO_ICMPV6);
        setsockopt(s, SOL_IPV6, IPV6_HDRINCL, 1);
        sendto(ipv6_icmp6_packet, addr);   /* returns -1, errno = EFAULT */
      
      The IPv4 equivalent code works. A workaround is to use IPPROTO_RAW
      instead of IPPROTO_ICMPV6.
      
      The failure happens because 2 bytes are eaten from the msghdr by
      rawv6_probe_proto_opt() starting from commit 19e3c66b ("ipv6
      equivalent of "ipv4: Avoid reading user iov twice after
      raw_probe_proto_opt""), but at that time it was not a problem because
      IPV6_HDRINCL was not yet introduced.
      
      Only eat these 2 bytes if hdrincl == 0.
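
The effect of eating 2 bytes before a header-included send can be modelled in userspace. This is only a sketch: struct cursor, copy_from(), probe_type_code() and send_model() are hypothetical names that imitate a consuming message iterator, and the real kernel code paths differ in detail.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical cursor over a user message; stands in for the msghdr iterator. */
struct cursor {
    const unsigned char *buf;
    size_t len;   /* total bytes in the message */
    size_t off;   /* bytes already consumed */
};

/* Copy exactly n bytes and advance; return 0 on success, -1 if too few remain. */
static int copy_from(struct cursor *c, unsigned char *dst, size_t n)
{
    if (c->len - c->off < n)
        return -1;
    memcpy(dst, c->buf + c->off, n);
    c->off += n;
    return 0;
}

/* Read the ICMPv6 type and code (2 bytes); this consumes them from the cursor. */
static int probe_type_code(struct cursor *c, unsigned char tc[2])
{
    return copy_from(c, tc, 2);
}

/*
 * Model of the send path. With hdrincl set, all len bytes must still be
 * readable from the cursor, so probing first makes the full copy fall short.
 */
static int send_model(const unsigned char *pkt, size_t len, int hdrincl,
                      int probe_always, unsigned char *out)
{
    struct cursor c = { pkt, len, 0 };
    unsigned char tc[2];

    assert(pkt != NULL && out != NULL);
    if (probe_always || !hdrincl) {
        if (probe_type_code(&c, tc) < 0)
            return -1;
    }
    if (hdrincl)
        return copy_from(&c, out, len);
    /* Without hdrincl the probed bytes are reused and the remainder is copied. */
    memcpy(out, tc, 2);
    return copy_from(&c, out + 2, len - 2);
}

/* Send a fixed 8-byte packet; 0 if it arrived intact, negative otherwise. */
int demo_send(int hdrincl, int probe_always)
{
    static const unsigned char pkt[8] = { 128, 0, 0, 0, 0, 1, 0, 1 };
    unsigned char out[sizeof(pkt)];

    if (send_model(pkt, sizeof(pkt), hdrincl, probe_always, out) < 0)
        return -1;
    return memcmp(pkt, out, sizeof(pkt)) == 0 ? 0 : -2;
}
```

In this model, demo_send(1, 1) fails because two bytes are already gone when the full packet is copied, while demo_send(1, 0) and demo_send(0, 0) both succeed, which is the behaviour the "only if hdrincl == 0" rule aims for.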
      
      Fixes: 715f504b ("ipv6: add IPV6_HDRINCL option for raw sockets")
      Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
      Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: use READ_ONCE() for inet->hdrincl as in ipv4 · 59e3e4b5
      Olivier Matz authored
      As it was done in commit 8f659a03 ("net: ipv4: fix for a race
      condition in raw_sendmsg") and commit 20b50d79 ("net: ipv4: emulate
      READ_ONCE() on ->hdrincl bit-field in raw_sendmsg()") for ipv4, copy the
      value of inet->hdrincl in a local variable, to avoid introducing a race
      condition in the next commit.
      Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Revert "fib_rules: return 0 directly if an exactly same rule exists when NLM_F_EXCL not supplied" · 4970b42d
      Hangbin Liu authored
      This reverts commit e9919a24.
      
      Nathan reported that the new behaviour breaks Android, as Android just adds
      new rules and deletes old ones.

      If we return 0 without adding duplicate rules, Android will remove the newly
      added rules, causing the system to soft-reboot.
      
      Fixes: e9919a24 ("fib_rules: return 0 directly if an exactly same rule exists when NLM_F_EXCL not supplied")
      Reported-by: Nathan Chancellor <natechancellor@gmail.com>
      Reported-by: Yaro Slav <yaro330@gmail.com>
      Reported-by: Maciej Żenczykowski <zenczykowski@gmail.com>
      Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
      Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
      Tested-by: Nathan Chancellor <natechancellor@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: aquantia: fix wol configuration not applied sometimes · 930b9a05
      Nikita Danilov authored
      WoL magic packet configuration sometimes does not work due to
      a couple of defects that were found.

      Mainly there was a regression introduced during readx_poll refactoring.

      Next, the FW request waiting time was too small. Sometimes that
      caused the sleep proxy config function to return with an error
      and to skip WoL configuration.
      Finally, WoL data were passed to FW from a buffer that had not been cleared.
      That could cause FW to accept garbage as configuration data.
      
      Fixes: 6a7f2277 ("net: aquantia: replace AQ_HW_WAIT_FOR with readx_poll_timeout_atomic")
      Signed-off-by: Nikita Danilov <nikita.danilov@aquantia.com>
      Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ethtool: fix potential userspace buffer overflow · 0ee4e769
      Vivien Didelot authored
      ethtool_get_regs() allocates a buffer of size ops->get_regs_len(),
      and passes it to the kernel driver via ops->get_regs() for filling.
      
      There is no restriction about what the kernel drivers can or cannot do
      with the open ethtool_regs structure. They usually set regs->version
      and ignore regs->len or set it to the same size as ops->get_regs_len().
      
      But if userspace allocates a smaller buffer for the registers dump,
      we would cause a userspace buffer overflow in the final copy_to_user()
      call, which uses the regs.len value potentially reset by the driver.
      
      To fix this, make this case obvious and store regs.len before calling
      ops->get_regs(), to only copy as much data as requested by userspace,
      up to the value returned by ops->get_regs_len().
      
      While at it, remove the redundant check for non-null regbuf.
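
The idea of saving the requested length before the driver runs can be sketched in userspace as follows. All names here (struct regs_hdr, DRIVER_REGS_LEN, driver_get_regs(), dump_regs()) are hypothetical, and memcpy() merely stands in for copy_to_user(); this is an illustration of the clamping, not the kernel function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DRIVER_REGS_LEN 16   /* hypothetical result of ops->get_regs_len() */

/* Hypothetical header, loosely modelled on the version and len fields. */
struct regs_hdr {
    unsigned int version;
    unsigned int len;
};

/* A driver that overwrites regs->len with the size of its full dump. */
static void driver_get_regs(struct regs_hdr *regs, unsigned char *buf)
{
    regs->version = 1;
    regs->len = DRIVER_REGS_LEN;
    memset(buf, 0xAB, DRIVER_REGS_LEN);
}

/* Copy at most user_len bytes of the dump; return the number of bytes copied. */
size_t dump_regs(unsigned char *user_buf, unsigned int user_len)
{
    unsigned char kbuf[DRIVER_REGS_LEN];
    struct regs_hdr regs = { 0, user_len };
    size_t copy_len = DRIVER_REGS_LEN;

    /* Fix the copy size before the driver can touch regs.len. */
    if (regs.len < copy_len)
        copy_len = regs.len;

    driver_get_regs(&regs, kbuf);        /* may reset regs.len */
    memcpy(user_buf, kbuf, copy_len);    /* stand-in for copy_to_user() */
    return copy_len;
}

/* Request user_len bytes; return bytes copied, or (size_t)-1 on overflow. */
size_t demo_request(unsigned int user_len)
{
    unsigned char user[64];
    size_t n;

    assert(user_len < sizeof(user));
    user[user_len] = 0x5A;               /* guard byte just past the request */
    n = dump_regs(user, user_len);
    return user[user_len] == 0x5A ? n : (size_t)-1;
}
```

A request for 8 bytes copies 8 and leaves the guard byte alone even though the driver reports 16; a request for 32 bytes is capped at the 16 bytes the driver actually provides.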
      Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
      Reviewed-by: Michal Kubecek <mkubecek@suse.cz>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Fix memory leak in sctp_process_init · 0a8dd9f6
      Neil Horman authored
      syzbot found the following leak in sctp_process_init:
      BUG: memory leak
      unreferenced object 0xffff88810ef68400 (size 1024):
        comm "syz-executor273", pid 7046, jiffies 4294945598 (age 28.770s)
        hex dump (first 32 bytes):
          1d de 28 8d de 0b 1b e3 b5 c2 f9 68 fd 1a 97 25  ..(........h...%
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        backtrace:
          [<00000000a02cebbd>] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline]
          [<00000000a02cebbd>] slab_post_alloc_hook mm/slab.h:439 [inline]
          [<00000000a02cebbd>] slab_alloc mm/slab.c:3326 [inline]
          [<00000000a02cebbd>] __do_kmalloc mm/slab.c:3658 [inline]
          [<00000000a02cebbd>] __kmalloc_track_caller+0x15d/0x2c0 mm/slab.c:3675
          [<000000009e6245e6>] kmemdup+0x27/0x60 mm/util.c:119
          [<00000000dfdc5d2d>] kmemdup include/linux/string.h:432 [inline]
          [<00000000dfdc5d2d>] sctp_process_init+0xa7e/0xc20 net/sctp/sm_make_chunk.c:2437
          [<00000000b58b62f8>] sctp_cmd_process_init net/sctp/sm_sideeffect.c:682 [inline]
          [<00000000b58b62f8>] sctp_cmd_interpreter net/sctp/sm_sideeffect.c:1384 [inline]
          [<00000000b58b62f8>] sctp_side_effects net/sctp/sm_sideeffect.c:1194 [inline]
          [<00000000b58b62f8>] sctp_do_sm+0xbdc/0x1d60 net/sctp/sm_sideeffect.c:1165
          [<0000000044e11f96>] sctp_assoc_bh_rcv+0x13c/0x200 net/sctp/associola.c:1074
          [<00000000ec43804d>] sctp_inq_push+0x7f/0xb0 net/sctp/inqueue.c:95
          [<00000000726aa954>] sctp_backlog_rcv+0x5e/0x2a0 net/sctp/input.c:354
          [<00000000d9e249a8>] sk_backlog_rcv include/net/sock.h:950 [inline]
          [<00000000d9e249a8>] __release_sock+0xab/0x110 net/core/sock.c:2418
          [<00000000acae44fa>] release_sock+0x37/0xd0 net/core/sock.c:2934
          [<00000000963cc9ae>] sctp_sendmsg+0x2c0/0x990 net/sctp/socket.c:2122
          [<00000000a7fc7565>] inet_sendmsg+0x64/0x120 net/ipv4/af_inet.c:802
          [<00000000b732cbd3>] sock_sendmsg_nosec net/socket.c:652 [inline]
          [<00000000b732cbd3>] sock_sendmsg+0x54/0x70 net/socket.c:671
          [<00000000274c57ab>] ___sys_sendmsg+0x393/0x3c0 net/socket.c:2292
          [<000000008252aedb>] __sys_sendmsg+0x80/0xf0 net/socket.c:2330
          [<00000000f7bf23d1>] __do_sys_sendmsg net/socket.c:2339 [inline]
          [<00000000f7bf23d1>] __se_sys_sendmsg net/socket.c:2337 [inline]
          [<00000000f7bf23d1>] __x64_sys_sendmsg+0x23/0x30 net/socket.c:2337
          [<00000000a8b4131f>] do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:3
      
      The problem was that the peer.cookie value points to an skb allocated
      area on the first pass through this function, at which point it is
      overwritten with a heap allocated value, but in certain cases, where a
      COOKIE_ECHO chunk is included in the packet, a second pass through
      sctp_process_init is made, where the cookie value is re-allocated,
      leaking the first allocation.
      
      Fix is to always allocate the cookie value, and free it when we are done
      using it.
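
A userspace sketch of the "always allocate, free when done" ownership rule is shown below. struct peer, peer_set_cookie() and peer_drop_cookie() are hypothetical, malloc()/free() stand in for the kernel allocators, and releasing the previous copy inside the setter is a choice made for this sketch rather than something taken from the patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical peer state; the cookie is always heap-owned in this sketch. */
struct peer {
    unsigned char *cookie;
    size_t cookie_len;
};

static int live_allocs;   /* outstanding cookie allocations, to detect leaks */

/* Release the cookie once it is no longer needed. */
static void peer_drop_cookie(struct peer *p)
{
    if (p->cookie) {
        free(p->cookie);
        live_allocs--;
    }
    p->cookie = NULL;
    p->cookie_len = 0;
}

/* Store a private copy of body, releasing any copy from an earlier pass. */
static int peer_set_cookie(struct peer *p, const unsigned char *body, size_t len)
{
    unsigned char *copy;

    assert(body != NULL && len > 0);
    copy = malloc(len);
    if (!copy)
        return -1;
    memcpy(copy, body, len);
    live_allocs++;

    peer_drop_cookie(p);   /* a second pass must not orphan the first copy */
    p->cookie = copy;
    p->cookie_len = len;
    return 0;
}

/* Process the same cookie twice, then finish; return outstanding allocations. */
int demo_two_passes(void)
{
    static const unsigned char body[4] = { 1, 2, 3, 4 };
    struct peer p = { NULL, 0 };

    if (peer_set_cookie(&p, body, sizeof(body)) != 0)
        return -1;
    if (peer_set_cookie(&p, body, sizeof(body)) != 0)
        return -1;
    peer_drop_cookie(&p);
    return live_allocs;
}
```

Because the cookie has exactly one owner at every point, two passes followed by a final drop leave no allocation outstanding.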
      Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      Reported-by: syzbot+f7e9153b037eac9b1df8@syzkaller.appspotmail.com
      CC: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      CC: "David S. Miller" <davem@davemloft.net>
      CC: netdev@vger.kernel.org
      Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: rds: fix memory leak when unload rds_rdma · b50e0587
      Zhu Yanjun authored
      When KASAN is enabled and "rmmod rds_rdma" is run after several
      rds connections have been created, the following will
      appear.
      
      "
      BUG rds_ib_incoming (Not tainted): Objects remaining
      in rds_ib_incoming on __kmem_cache_shutdown()
      
      Call Trace:
       dump_stack+0x71/0xab
       slab_err+0xad/0xd0
       __kmem_cache_shutdown+0x17d/0x370
       shutdown_cache+0x17/0x130
       kmem_cache_destroy+0x1df/0x210
       rds_ib_recv_exit+0x11/0x20 [rds_rdma]
       rds_ib_exit+0x7a/0x90 [rds_rdma]
       __x64_sys_delete_module+0x224/0x2c0
       ? __ia32_sys_delete_module+0x2c0/0x2c0
       do_syscall_64+0x73/0x190
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      "
      This is an rds connection memory leak. The root cause is:
      When "rmmod rds_rdma" is run, rds_ib_remove_one calls
      rds_ib_dev_shutdown to drop the rds connections.
      rds_ib_dev_shutdown calls rds_conn_drop to drop the rds
      connections as below.
      "
      rds_conn_path_drop(&conn->c_path[0], false);
      "
      In the above, destroy is set to false.
      void rds_conn_path_drop(struct rds_conn_path *cp, bool destroy)
      {
              atomic_set(&cp->cp_state, RDS_CONN_ERROR);
      
              rcu_read_lock();
              if (!destroy && rds_destroy_pending(cp->cp_conn)) {
                      rcu_read_unlock();
                      return;
              }
              queue_work(rds_wq, &cp->cp_down_w);
              rcu_read_unlock();
      }
      In the above function, destroy is false, so rds_destroy_pending
      is checked and the function can return before queueing the shutdown
      work. This does not move rds connections to ib_nodev_conns.
      So destroy is set to true to move rds connections to ib_nodev_conns.
      In rds_ib_unregister_client, flush_workqueue is called to make rds_wq
      finish shutting down rds connections. The function rds_ib_destroy_nodev_conns
      is then called to shut down the rds connections finally.
      Then rds_ib_recv_exit is called to destroy the slab caches.
      
      void rds_ib_recv_exit(void)
      {
              kmem_cache_destroy(rds_ib_incoming_slab);
              kmem_cache_destroy(rds_ib_frag_slab);
      }
      The above slab memory leak will not occur again.
      
      From tests,
      256 rds connections
      [root@ca-dev14 ~]# time rmmod rds_rdma
      
      real    0m16.522s
      user    0m0.000s
      sys     0m8.152s
      512 rds connections
      [root@ca-dev14 ~]# time rmmod rds_rdma
      
      real    0m32.054s
      user    0m0.000s
      sys     0m15.568s
      
      To rmmod rds_rdma with 256 rds connections, about 16 seconds are needed.
      And with 512 rds connections, about 32 seconds are needed.
      From ftrace, when one rds connection is destroyed,
      
      "
       19)               |  rds_conn_destroy [rds]() {
       19)   7.782 us    |    rds_conn_path_drop [rds]();
       15)               |  rds_shutdown_worker [rds]() {
       15)               |    rds_conn_shutdown [rds]() {
       15)   1.651 us    |      rds_send_path_reset [rds]();
       15)   7.195 us    |    }
       15) + 11.434 us   |  }
       19)   2.285 us    |    rds_cong_remove_conn [rds]();
       19) * 24062.76 us |  }
      "
      So when many rds connections are destroyed, the function
      rds_ib_destroy_nodev_conns accounts for most of the time.
      Suggested-by: Håkon Bugge <haakon.bugge@oracle.com>
      Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: fix the check before getting the cookie in rt6_get_cookie · b7999b07
      Xin Long authored
      In Jianlin's testing, netperf was broken with 'Connection reset by peer',
      as the cookie check failed in rt6_check() and ip6_dst_check() always
      returned NULL.
      
      It's caused by commit 93531c67 ("net/ipv6: separate handling of FIB
      entries from dst based routes"), after which the cookie for setting
      dst_cookie can be got only when 'c1' (see below) holds, whereas
      rt6_check() is called to check dst_cookie when 'c1' does not hold,
      as we can see in ip6_dst_check().
      
      Since in ip6_dst_check() both rt6_dst_from_check() (c1) and rt6_check()
      (!c1) will check the 'from' cookie, this patch is to remove the c1 check
      in rt6_get_cookie(), so that the dst_cookie can always be set properly.
      
      c1:
        (rt->rt6i_flags & RTF_PCPU || unlikely(!list_empty(&rt->rt6i_uncached)))
      
      Fixes: 93531c67 ("net/ipv6: separate handling of FIB entries from dst based routes")
      Reported-by: Jianlin Shi <jishi@redhat.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 05 Jun, 2019 9 commits
    • ipv4: not do cache for local delivery if bc_forwarding is enabled · 0a90478b
      Xin Long authored
      With the topo:
      
          h1 ---| rp1            |
                |     route  rp3 |--- h3 (192.168.200.1)
          h2 ---| rp2            |
      
      If bc_forwarding is set on rp1 but not on rp2, then after
      doing "ping 192.168.200.255" on h1, the packets of the same ping
      run on h2 can still be forwarded.
      
      This issue was caused by the input route cache. It should only do
      the cache for either bc forwarding or local delivery. Otherwise,
      local delivery can use the route cache for bc forwarding of other
      interfaces.
      
      This patch is to fix it by not doing cache for local delivery if
      all.bc_forwarding is enabled.
      
      Note that we don't fix it by checking route cache local flag after
      rt_cache_valid() in "local_input:" and "ip_mkroute_input", as the
      common route code shouldn't be touched for bc_forwarding.
      
      Fixes: 5cbf777c ("route: add support for directed broadcast forwarding")
      Reported-by: Jianlin Shi <jishi@redhat.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 's390-qeth-fixes' · e7a9fe7b
      David S. Miller authored
      Julian Wiedmann says:
      
      ====================
      s390/qeth: fixes 2019-06-05
      
      one more shot...  now with patch 2 fixed up so that it uses the
      dst entry returned from dst_check().
      
      From the v1 cover letter:
      
      Please apply the following set of qeth fixes to -net.
      
      - The first two patches fix issues in the L3 driver's cast type
        selection for transmitted skbs.
      - Alexandra adds a sanity check when retrieving VLAN information from
        neighbour address events.
      - The last patch adds some missing error handling for qeth's new
        multiqueue code.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • s390/qeth: handle error when updating TX queue count · bd966839
      Julian Wiedmann authored
      netif_set_real_num_tx_queues() can return an error, deal with it.
      
      Fixes: 73dc2daf ("s390/qeth: add TX multiqueue support for OSA devices")
      Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • s390/qeth: fix VLAN attribute in bridge_hostnotify udev event · 33572619
      Alexandra Winter authored
      Enabling sysfs attribute bridge_hostnotify triggers a series of udev events
      for the MAC addresses of all currently connected peers. In case no VLAN is
      set for a peer, the device reports the corresponding MAC addresses with
      VLAN ID 4096. This currently results in attribute VLAN=4096 for all
      non-VLAN interfaces in the initial series of events after host-notify is
      enabled.
      
      Instead, no VLAN attribute should be reported in the udev event for
      non-VLAN interfaces.
      
      Only the initial events face this issue. For dynamic changes that are
      reported later, the device uses a validity flag.
      
      This also changes the code so that it now sets the VLAN attribute for
      MAC addresses with VID 0. On Linux, no qeth interface will ever be
      registered with VID 0: Linux kernel registers VID 0 on all network
      interfaces initially, but qeth will drop .ndo_vlan_rx_add_vid for VID 0.
      Peers with other OSs could register MACs with VID 0.
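
The reporting rule described above (no VLAN attribute for the out-of-range value 4096, but an attribute for VID 0) might look like this in isolation. format_vlan_attr() and N_VID are hypothetical names for this sketch; how the driver actually tests the value is not shown in the message.

```c
#include <assert.h>
#include <stdio.h>

#define N_VID 4096   /* VLAN IDs are 12 bits wide, so valid IDs are 0..4095 */

/* Write "VLAN=<vid>" for a valid ID; return 1 if an attribute was written. */
static int format_vlan_attr(unsigned int vid, char *out, size_t outlen)
{
    assert(out != NULL && outlen > 0);
    if (vid >= N_VID) {
        out[0] = '\0';   /* no VLAN set for this peer: report nothing */
        return 0;
    }
    snprintf(out, outlen, "VLAN=%u", vid);
    return 1;
}

/* Convenience wrapper: does an event for this VID carry a VLAN attribute? */
int demo_has_vlan_attr(unsigned int vid)
{
    char attr[16];

    return format_vlan_attr(vid, attr, sizeof(attr));
}
```

Under this rule VID 0 and VID 4095 both produce an attribute, while the placeholder value 4096 produces none.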
      
      Fixes: 9f48b9db ("qeth: bridgeport support - address notifications")
      Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
      Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • s390/qeth: check dst entry before use · 0cd6783d
      Julian Wiedmann authored
      While qeth_l3 uses netif_keep_dst() to hold onto the dst, a skb's dst
      may still have been obsoleted (via dst_dev_put()) by the time that we
      end up using it. The dst then points to the loopback interface, which
      means the neighbour lookup in qeth_l3_get_cast_type() determines a bogus
      cast type of RTN_BROADCAST.
      For IQD interfaces this causes us to place such skbs on the wrong
      HW queue, resulting in TX errors.
      
      Fix-up the various call sites to first validate the dst entry with
      dst_check(), and fall back accordingly.
      Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • s390/qeth: handle limited IPv4 broadcast in L3 TX path · 72c87976
      Julian Wiedmann authored
      When selecting the cast type of a neighbourless IPv4 skb (eg. on a raw
      socket), qeth_l3 falls back to the packet's destination IP address.
      For this case we should classify traffic sent to 255.255.255.255 as
      broadcast.
      This fixes DHCP requests, which were misclassified as unicast
      (and for IQD interfaces thus ended up on the wrong HW queue).
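
Classifying by destination address, with 255.255.255.255 treated as broadcast, can be sketched as below. The enum, ipv4() and classify_ipv4_dst() are hypothetical names, and addresses are kept in host byte order for simplicity; the driver's own helper and constants are not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cast types for this sketch. */
enum cast_type { CAST_UNICAST, CAST_MULTICAST, CAST_BROADCAST };

/* Build a host-byte-order IPv4 address from its four dotted-quad parts. */
uint32_t ipv4(unsigned int a, unsigned int b, unsigned int c, unsigned int d)
{
    assert(a < 256 && b < 256 && c < 256 && d < 256);
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) | ((uint32_t)c << 8) | d;
}

/* Classify a destination address when no neighbour entry is available. */
enum cast_type classify_ipv4_dst(uint32_t addr)
{
    if (addr == 0xFFFFFFFFu)                  /* limited broadcast */
        return CAST_BROADCAST;
    if ((addr & 0xF0000000u) == 0xE0000000u)  /* 224.0.0.0/4 */
        return CAST_MULTICAST;
    return CAST_UNICAST;
}
```

A DHCP request sent to 255.255.255.255 is then classified as broadcast rather than falling through to unicast.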
      Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: fix indirect calls helpers for ptype list hooks. · fdf71426
      Paolo Abeni authored
      As Eric noted, the current wrapper for ptype func hook inside
      __netif_receive_skb_list_ptype() has no chance of avoiding the indirect
      call: we enter such code path only for protocols other than ipv4 and
      ipv6.
      
      Instead we can wrap the list_func invocation.
      
      v1 -> v2:
       - use the correct fix tag
      
      Fixes: f5737cba ("net: use indirect calls helpers for ptype hook")
      Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Acked-by: Edward Cree <ecree@solarflare.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ipvlan: Fix ipvlan device tso disabled while NETIF_F_IP_CSUM is set · ceae266b
      Miaohe Lin authored
      There are some NICs, such as hinic, with NETIF_F_IP_CSUM and NETIF_F_TSO
      on but NETIF_F_HW_CSUM off. The ipvlan device features will then be
      NETIF_F_TSO on with NETIF_F_IP_CSUM and NETIF_F_HW_CSUM both off, as
      IPVLAN_FEATURES only cares about NETIF_F_HW_CSUM. So TSO will be
      disabled in netdev_fix_features.
      For example:
      Features for enp129s0f0:
      rx-checksumming: on
      tx-checksumming: on
              tx-checksum-ipv4: on
              tx-checksum-ip-generic: off [fixed]
              tx-checksum-ipv6: on
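
The feature arithmetic behind this can be sketched with made-up bit values. F_IP_CSUM, F_HW_CSUM and F_TSO are hypothetical stand-ins for the NETIF_F_* flags, fix_features() models only the single rule that TSO needs some TX checksum offload, and the wider mask in the second case is an assumption about how the loss could be avoided, not a statement of what the patch changes.

```c
#include <assert.h>

/* Hypothetical bit values; the real NETIF_F_* flags are defined by the kernel. */
#define F_IP_CSUM (1u << 0)
#define F_HW_CSUM (1u << 1)
#define F_TSO     (1u << 2)
#define F_KNOWN   (F_IP_CSUM | F_HW_CSUM | F_TSO)

/* Model of one rule: TSO is dropped when no TX checksum offload remains. */
static unsigned int fix_features(unsigned int f)
{
    if ((f & F_TSO) && !(f & (F_HW_CSUM | F_IP_CSUM)))
        f &= ~F_TSO;
    return f;
}

/* Features an upper device ends up with: lower features filtered by a mask. */
unsigned int upper_features(unsigned int lower, unsigned int mask)
{
    assert((lower & ~F_KNOWN) == 0 && (mask & ~F_KNOWN) == 0);
    return fix_features(lower & mask);
}
```

With a lower device offering F_IP_CSUM | F_TSO, a mask of F_HW_CSUM | F_TSO filters out the only checksum bit and TSO is then dropped; a mask that also lets F_IP_CSUM through keeps both.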
      
      Fixes: a188222b ("net: Rename NETIF_F_ALL_CSUM to NETIF_F_CSUM_MASK")
      Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • udp: only choose unbound UDP socket for multicast when not in a VRF · 82ba25c6
      Tim Beale authored
      By default, packets received in another VRF should not be passed to an
      unbound socket in the default VRF. This patch updates the IPv4 UDP
      multicast logic to match the unicast VRF logic (in compute_score()),
      as well as the IPv6 mcast logic (in __udp_v6_is_mcast_sock()).
      
      The particular case I noticed was DHCP discover packets going
      to the 255.255.255.255 address, which are handled by
      __udp4_lib_mcast_deliver(). The previous code meant that running
      multiple different DHCP server or relay agent instances across VRFs
      did not work correctly - any server/relay agent in the default VRF
      received DHCP discover packets for all other VRFs.
      
      Fixes: 6da5b0f0 ("net: ensure unbound datagram socket to be chosen when not in a VRF")
      Signed-off-by: Tim Beale <timbeale@catalyst.net.nz>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 04 Jun, 2019 5 commits
  4. 03 Jun, 2019 6 commits
  5. 02 Jun, 2019 3 commits
  6. 31 May, 2019 6 commits
    • net: dsa: sja1105: Don't store frame type in skb->cb · e8d67fa5
      Vladimir Oltean authored
      Due to a confusion I thought that eth_type_trans() was called by the
      network stack whereas it can actually be called by network drivers to
      figure out the skb protocol and next packet_type handlers.
      
      In light of the above, it is not safe to store the frame type from the
      DSA tagger's .filter callback (first entry point on RX path), since GRO
      is yet to be invoked on the received traffic.  Hence it is very likely
      that the skb->cb will actually get overwritten between eth_type_trans()
      and the actual DSA packet_type handler.
      
      Of course, what this patch fixes is the actual overwriting of the
      SJA1105_SKB_CB(skb)->type field from the GRO layer, which made all
      frames be seen as SJA1105_FRAME_TYPE_NORMAL (0).
      
      Fixes: 227d07a0 ("net: dsa: sja1105: Add support for traffic through standalone ports")
      Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 036e3431
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Fix OOPS during nf_tables rule dump, from Florian Westphal.
      
       2) Use after free in ip_vs_in, from Yue Haibing.
      
       3) Fix various kTLS bugs (NULL deref during device removal resync,
          netdev notification ignoring, etc.) From Jakub Kicinski.
      
       4) Fix ipv6 redirects with VRF, from David Ahern.
      
       5) Memory leak fix in igmpv3_del_delrec(), from Eric Dumazet.
      
       6) Missing memory allocation failure check in ip6_ra_control(), from
          Gen Zhang. And likewise fix ip_ra_control().
      
       7) TX clean budget logic error in aquantia, from Igor Russkikh.
      
       8) SKB leak in llc_build_and_send_ui_pkt(), from Eric Dumazet.
      
       9) Double frees in mlx5, from Parav Pandit.
      
      10) Fix lost MAC address in r8169 during PCI D3, from Heiner Kallweit.
      
      11) Fix botched register access in mvpp2, from Antoine Tenart.
      
      12) Use after free in napi_gro_frags(), from Eric Dumazet.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (89 commits)
        net: correct zerocopy refcnt with udp MSG_MORE
        ethtool: Check for vlan etype or vlan tci when parsing flow_rule
        net: don't clear sock->sk early to avoid trouble in strparser
        net-gro: fix use-after-free read in napi_gro_frags()
        net: dsa: tag_8021q: Create a stable binary format
        net: dsa: tag_8021q: Change order of rx_vid setup
        net: mvpp2: fix bad MVPP2_TXQ_SCHED_TOKEN_CNTR_REG queue value
        ipv4: tcp_input: fix stack out of bounds when parsing TCP options.
        mlxsw: spectrum: Prevent force of 56G
        mlxsw: spectrum_acl: Avoid warning after identical rules insertion
        net: dsa: mv88e6xxx: fix handling of upper half of STATS_TYPE_PORT
        r8169: fix MAC address being lost in PCI D3
        net: core: support XDP generic on stacked devices.
        netvsc: unshare skb in VF rx handler
        udp: Avoid post-GRO UDP checksum recalculation
        net: phy: dp83867: Set up RGMII TX delay
        net: phy: dp83867: do not call config_init twice
        net: phy: dp83867: increase SGMII autoneg timer duration
        net: phy: dp83867: fix speed 10 in sgmii mode
        net: phy: marvell10g: report if the PHY fails to boot firmware
        ...
    • Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · adc3f554
      Linus Torvalds authored
      Pull arm64 fixes from Will Deacon:
       "The fixes are still trickling in for arm64, but the only really
        significant one here is actually fixing a regression in the botched
        module relocation range checking merged for -rc2.
      
        Hopefully we've nailed it this time.
      
         - Fix implementation of our set_personality() system call, which
           wasn't being wrapped properly
      
         - Fix system call function types to keep CFI happy
      
         - Fix siginfo layout when delivering SIGKILL after a kernel fault
      
         - Really fix module relocation range checking"
      
      * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
        arm64: use the correct function type for __arm64_sys_ni_syscall
        arm64: use the correct function type in SYSCALL_DEFINE0
        arm64: fix syscall_fn_t type
        signal/arm64: Use force_sig not force_sig_fault for SIGKILL
        arm64/module: revert to unsigned interpretation of ABS16/32 relocations
        arm64: Fix the arm64_personality() syscall wrapper redirection
    • Merge tag 'for-5.2-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · 318adf8e
      Linus Torvalds authored
      Pull btrfs fixes from David Sterba:
       "A few more fixes for bugs reported by users, fuzzing tools and
        regressions:
      
         - fix crashes in relocation:
             + resuming interrupted balance operation does not properly clean
               up orphan trees
             + with enabled qgroups, resuming needs to be more careful about
               block groups due to limited context when updating qgroups
      
         - fsync and logging fixes found by fuzzing
      
         - incremental send fixes for no-holes and clone
      
         - fix spin lock type used in timer function for zstd"
      
      * tag 'for-5.2-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
        Btrfs: fix race updating log root item during fsync
        Btrfs: fix wrong ctime and mtime of a directory after log replay
        Btrfs: fix fsync not persisting changed attributes of a directory
        btrfs: qgroup: Check bg while resuming relocation to avoid NULL pointer dereference
        btrfs: reloc: Also queue orphan reloc tree for cleanup to avoid BUG_ON()
        Btrfs: incremental send, fix emission of invalid clone operations
        Btrfs: incremental send, fix file corruption when no-holes feature is enabled
        btrfs: correct zstd workspace manager lock to use spin_lock_bh()
        btrfs: Ensure replaced device doesn't have pending chunk allocation
    • Merge tag 'configfs-for-5.2-2' of git://git.infradead.org/users/hch/configfs · 8cb7104d
      Linus Torvalds authored
      Pull configfs fix from Christoph Hellwig:
      
       - fix a use after free in configfs_d_iput (Sahitya Tummala)
      
      * tag 'configfs-for-5.2-2' of git://git.infradead.org/users/hch/configfs:
        configfs: Fix use-after-free when accessing sd->s_dentry
    • Merge tag 'sound-5.2-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · c5ba1712
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
       "No big surprises here, just a few device-specific fixes.
      
        HD-audio received several fixes for Acer, Dell, Huawei and other
        laptops as well as the workaround for the new Intel chipset. One
        significant one-liner fix is the disablement of the node-power saving
        on Realtek codecs, which may potentially cover annoying bugs like the
        background noises or click noises on many devices.
      
        Other than that, a fix for FireWire bit definitions, and another fix
        for LINE6 USB audio bug that was discovered by syzkaller"
      
      * tag 'sound-5.2-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
        ALSA: fireface: Use ULL suffixes for 64-bit constants
        ALSA: hda/realtek - Improve the headset mic for Acer Aspire laptops
        ALSA: line6: Assure canceling delayed work at disconnection
        ALSA: hda - Force polling mode on CNL for fixing codec communication
        ALSA: hda/realtek - Enable micmute LED for Huawei laptops
        ALSA: hda/realtek - Set default power save node to 0
        ALSA: hda/realtek - Check headset type by unplug and resume
  7. 30 May, 2019 1 commit