1. 21 Feb, 2019 5 commits
    • ixgbe: fix potential RX buffer starvation for AF_XDP · 4a9b32f3
      Magnus Karlsson authored
      When the RX rings are created they are also populated with buffers so
      that packets can be received. Usually these are kernel buffers, but
      for AF_XDP in zero-copy mode, these are user-space buffers and in this
      case the application might not have sent down any buffers to the
      driver at this point. And if no buffers are allocated at ring creation
      time, no packets can be received and no interrupts will be generated so
      the NAPI poll function that allocates buffers to the rings will never
      get executed.
      
      To rectify this, we kick the NAPI context of any queue with an
      attached AF_XDP zero-copy socket in two places in the code. Once after
      an XDP program has loaded and once after the umem is registered.  This
      takes care of both cases: XDP program gets loaded first then AF_XDP
      socket is created, and the reverse, AF_XDP socket is created first,
      then XDP program is loaded.
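
      The starvation described above can be illustrated with a small
      user-space toy model. This is only a sketch: the toy_* names and the
      ring structure are hypothetical and are not taken from the ixgbe
      driver. It shows why a ring with no posted buffers never gets polled
      unless something schedules the poll explicitly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model, not driver code: every name here is hypothetical. */
struct toy_rx_ring {
    size_t fill_queue;     /* buffers user space has handed down */
    size_t posted;         /* buffers currently posted to the ring */
    bool   napi_scheduled; /* true when the poll routine is due to run */
};

/* Stand-in for the NAPI poll routine: moves user buffers onto the ring. */
static size_t toy_napi_poll(struct toy_rx_ring *r)
{
    size_t moved = r->fill_queue;

    r->posted += moved;
    r->fill_queue = 0;
    r->napi_scheduled = false;
    return moved;
}

/* A packet is received, and an interrupt raised, only if a buffer is posted. */
static bool toy_packet_arrives(struct toy_rx_ring *r)
{
    if (r->posted == 0)
        return false;         /* dropped: no buffer, no interrupt, no poll */
    r->posted--;
    r->napi_scheduled = true; /* the interrupt schedules the poll routine */
    return true;
}

/* The "kick": schedule the poll routine without waiting for an interrupt. */
static void toy_kick_napi(struct toy_rx_ring *r)
{
    assert(r != NULL);
    r->napi_scheduled = true;
}

/* Run the poll routine if it is scheduled; returns buffers moved. */
static size_t toy_run_pending(struct toy_rx_ring *r)
{
    return r->napi_scheduled ? toy_napi_poll(r) : 0;
}
```

      In this model, buffers sitting in fill_queue stay there for good
      unless toy_kick_napi() is called, which mirrors the two kick points
      the commit message describes.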
      
      Fixes: d0bcacd0 ("ixgbe: add AF_XDP zero-copy Rx support")
      Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • i40e: fix potential RX buffer starvation for AF_XDP · 14ffeb52
      Magnus Karlsson authored
      When the RX rings are created they are also populated with buffers
      so that packets can be received. Usually these are kernel buffers,
      but for AF_XDP in zero-copy mode, these are user-space buffers and
      in this case the application might not have sent down any buffers
      to the driver at this point. And if no buffers are allocated at ring
      creation time, no packets can be received and no interrupts will be
      generated so the NAPI poll function that allocates buffers to the
      rings will never get executed.
      
      To rectify this, we kick the NAPI context of any queue with an
      attached AF_XDP zero-copy socket in two places in the code. Once
      after an XDP program has loaded and once after the umem is registered.
      This takes care of both cases: XDP program gets loaded first then AF_XDP
      socket is created, and the reverse, AF_XDP socket is created first,
      then XDP program is loaded.
      
      Fixes: 0a714186 ("i40e: add AF_XDP zero-copy Rx support")
      Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • ixgbe: fix older devices that do not support IXGBE_MRQC_L3L4TXSWEN · 156a67a9
      Jeff Kirsher authored
      Enabling L3/L4 filtering for transmit switched packets on all
      devices caused an unforeseen issue on older devices when trying to send UDP
      traffic in an ordered sequence.  This bit was originally intended for X550
      devices, which supported this feature, so limit the scope of this bit to
      only X550 devices.
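
      A minimal sketch of that scoping, under stated assumptions: the
      enum, the bit value and the function name below are hypothetical
      stand-ins, not the driver's real definitions. The only point is that
      the bit is ORed in for X550-class devices and left clear otherwise.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real definitions live in the driver headers. */
enum toy_mac_type { TOY_MAC_82599, TOY_MAC_X540, TOY_MAC_X550, TOY_MAC_X550EM };

#define TOY_MRQC_L3L4TXSWEN 0x00008000u

static_assert(TOY_MRQC_L3L4TXSWEN == (1u << 15), "toy bit is a single flag");

/* Return the register value to program, setting the bit only on X550-class MACs. */
static uint32_t toy_setup_mrqc(enum toy_mac_type mac, uint32_t mrqc)
{
    switch (mac) {
    case TOY_MAC_X550:
    case TOY_MAC_X550EM:
        mrqc |= TOY_MRQC_L3L4TXSWEN;
        break;
    default:
        break; /* older devices: leave the bit clear */
    }
    return mrqc;
}
```
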
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
    • missing barriers in some of unix_sock ->addr and ->path accesses · ae3b5641
      Al Viro authored
      Several u->addr and u->path users are not holding any locks in
      common with unix_bind().  unix_state_lock() is useless for those
      purposes.
      
      u->addr is assign-once and *(u->addr) is fully set up by the time
      we set u->addr (all under unix_table_lock).  u->path is also
      set in the same critical area, also before setting u->addr, and
      any unix_sock with ->path filled will have non-NULL ->addr.
      
      So setting ->addr with smp_store_release() is all we need for those
      "lockless" users - just have them fetch ->addr with smp_load_acquire()
      and don't even bother looking at ->path if they see NULL ->addr.
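
      The publish/consume pattern in the paragraph above can be sketched
      in user space with C11 atomics. This is an analogy under stated
      assumptions, not the kernel code: atomic_store_explicit() with
      memory_order_release stands in for smp_store_release(),
      atomic_load_explicit() with memory_order_acquire stands in for
      smp_load_acquire(), and the toy_* types and functions are
      hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the address and socket objects. */
struct toy_addr {
    char   name[16];
    size_t len;
};

struct toy_sock {
    const char *path;                /* written before addr is published */
    _Atomic(struct toy_addr *) addr; /* assign-once, NULL until bound */
};

/* Writer: fully set up *a and path, then publish addr with release order. */
static void toy_bind(struct toy_sock *u, struct toy_addr *a,
                     const char *name, const char *path)
{
    assert(a != NULL);
    strncpy(a->name, name, sizeof(a->name) - 1);
    a->name[sizeof(a->name) - 1] = '\0';
    a->len = strlen(a->name);
    u->path = path;
    atomic_store_explicit(&u->addr, a, memory_order_release);
}

/* Lockless reader: look at path only after an acquire load saw non-NULL addr. */
static const char *toy_path_if_bound(struct toy_sock *u)
{
    struct toy_addr *a = atomic_load_explicit(&u->addr, memory_order_acquire);

    if (!a)
        return NULL; /* not bound yet: do not touch path at all */
    return u->path;
}
```

      Because every store to *a and to path precedes the release store,
      a reader that observes the non-NULL pointer through the acquire load
      also observes those earlier stores.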
      
      Users of ->addr and ->path fall into several classes now:
          1) ones that do smp_load_acquire(u->addr) and access *(u->addr)
      and u->path only if smp_load_acquire() has returned non-NULL.
          2) places holding unix_table_lock.  These are guaranteed that
      *(u->addr) is seen fully initialized.  If unix_sock is in one of the
      "bound" chains, so's ->path.
          3) unix_sock_destructor() using ->addr is safe.  All places
      that set u->addr are guaranteed to have seen all stores *(u->addr)
      while holding a reference to u and unix_sock_destructor() is called
      when (atomic) refcount hits zero.
          4) unix_release_sock() using ->path is safe.  unix_bind()
      is serialized wrt unix_release() (normally - by struct file
      refcount), and for the instances that had ->path set by unix_bind()
      unix_release_sock() comes from unix_release(), so they are fine.
      Instances that had it set in unix_stream_connect() either end up
      attached to a socket (in unix_accept()), in which case the call
      chain to unix_release_sock() and serialization are the same as in
      the previous case, or they never get accept'ed and unix_release_sock()
      is called when the listener is shut down and its queue gets purged.
      In that case the listener's queue lock provides the barriers needed -
      unix_stream_connect() shoves our unix_sock into listener's queue
      under that lock right after having set ->path and eventual
      unix_release_sock() caller picks them from that queue under the
      same lock right before calling unix_release_sock().
          5) unix_find_other() use of ->path is pointless, but safe -
      it happens with successful lookup by (abstract) name, so ->path.dentry
      is guaranteed to be NULL there.
      earlier-variant-reviewed-by: "Paul E. McKenney" <paulmck@linux.ibm.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: marvell: mvneta: fix DMA debug warning · a8fef9ba
      Russell King authored
      Booting 4.20 on SolidRun Clearfog issues this warning with DMA API
      debug enabled:
      
      WARNING: CPU: 0 PID: 555 at kernel/dma/debug.c:1230 check_sync+0x514/0x5bc
      mvneta f1070000.ethernet: DMA-API: device driver tries to sync DMA memory it has not allocated [device address=0x000000002dd7dc00] [size=240 bytes]
      Modules linked in: ahci mv88e6xxx dsa_core xhci_plat_hcd xhci_hcd devlink armada_thermal marvell_cesa des_generic ehci_orion phy_armada38x_comphy mcp3021 spi_orion evbug sfp mdio_i2c ip_tables x_tables
      CPU: 0 PID: 555 Comm: bridge-network- Not tainted 4.20.0+ #291
      Hardware name: Marvell Armada 380/385 (Device Tree)
      [<c0019638>] (unwind_backtrace) from [<c0014888>] (show_stack+0x10/0x14)
      [<c0014888>] (show_stack) from [<c07f54e0>] (dump_stack+0x9c/0xd4)
      [<c07f54e0>] (dump_stack) from [<c00312bc>] (__warn+0xf8/0x124)
      [<c00312bc>] (__warn) from [<c00313b0>] (warn_slowpath_fmt+0x38/0x48)
      [<c00313b0>] (warn_slowpath_fmt) from [<c00b0370>] (check_sync+0x514/0x5bc)
      [<c00b0370>] (check_sync) from [<c00b04f8>] (debug_dma_sync_single_range_for_cpu+0x6c/0x74)
      [<c00b04f8>] (debug_dma_sync_single_range_for_cpu) from [<c051bd14>] (mvneta_poll+0x298/0xf58)
      [<c051bd14>] (mvneta_poll) from [<c0656194>] (net_rx_action+0x128/0x424)
      [<c0656194>] (net_rx_action) from [<c000a230>] (__do_softirq+0xf0/0x540)
      [<c000a230>] (__do_softirq) from [<c00386e0>] (irq_exit+0x124/0x144)
      [<c00386e0>] (irq_exit) from [<c009b5e0>] (__handle_domain_irq+0x58/0xb0)
      [<c009b5e0>] (__handle_domain_irq) from [<c03a63c4>] (gic_handle_irq+0x48/0x98)
      [<c03a63c4>] (gic_handle_irq) from [<c0009a10>] (__irq_svc+0x70/0x98)
      ...
      
      This appears to be caused by mvneta_rx_hwbm() calling
      dma_sync_single_range_for_cpu() with the wrong struct device pointer,
      as the buffer manager device pointer is used to map and unmap the
      buffer.  Fix this.
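
      The kind of check that produces such a warning can be modelled in a
      few lines. This is a toy sketch with hypothetical names (toy_device,
      toy_mapping, toy_dma_sync_ok); it is not the DMA debug
      implementation. It only shows that a sync is accepted when it names
      the same device that created the mapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for a device and a tracked DMA mapping. */
struct toy_device {
    const char *name;
};

struct toy_mapping {
    const struct toy_device *owner; /* device that created the mapping */
    uint32_t dma_addr;
    bool     mapped;
};

/* Record which device mapped the buffer. */
static void toy_dma_map(struct toy_mapping *m, const struct toy_device *dev,
                        uint32_t dma_addr)
{
    assert(dev != NULL);
    m->owner = dev;
    m->dma_addr = dma_addr;
    m->mapped = true;
}

/* A sync is consistent only if it names the mapping's owner and address. */
static bool toy_dma_sync_ok(const struct toy_mapping *m,
                            const struct toy_device *dev, uint32_t dma_addr)
{
    return m->mapped && m->owner == dev && m->dma_addr == dma_addr;
}
```

      In this model, mapping through the buffer manager device and then
      syncing through the Ethernet device fails the check, while syncing
      through the buffer manager device passes.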
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 20 Feb, 2019 2 commits
    • net: dsa: fix unintended change of bridge interface STP state · 9c2054a5
      Russell King authored
      When a DSA port is added to a bridge and brought up, the resulting STP
      state programmed into the hardware depends on the order that these
      operations are performed.  However, the Linux bridge code believes that
      the port is in disabled mode.
      
      If the DSA port is first added to a bridge and then brought up, it will
      be in blocking mode.  If it is brought up and then added to the bridge,
      it will be in disabled mode.
      
      This difference is caused by DSA always setting the STP mode in
      dsa_port_enable() whether or not this port is part of a bridge.  Since
      bridge always sets the STP state when the port is added, brought up or
      taken down, it is unnecessary for us to manipulate the STP state.
      
      Apparently, this code was copied from Rocker, and the very next day a
      similar fix for Rocker was merged but was not propagated to DSA.  See
      e47172ab ("rocker: put port in FORWADING state after leaving bridge")
      
      Fixes: b73adef6 ("net: dsa: integrate with SWITCHDEV for HW bridging")
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
      Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 40e196a9
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Fix suspend and resume in mt76x0u USB driver, from Stanislaw
          Gruszka.
      
       2) Missing memory barriers in xsk, from Magnus Karlsson.
      
       3) rhashtable fixes in mac80211 from Herbert Xu.
      
       4) 32-bit MIPS eBPF JIT fixes from Paul Burton.
      
       5) Fix for_each_netdev_feature() on big endian, from Hauke Mehrtens.
      
       6) GSO validation fixes from Willem de Bruijn.
      
       7) Endianness fix for dwmac4 timestamp handling, from Alexandre Torgue.
      
       8) More strict checks in tcp_v4_err(), from Eric Dumazet.
      
       9) af_alg_release should NULL out the sk after the sock_put(), from Mao
          Wenan.
      
      10) Missing unlock in mac80211 mesh error path, from Wei Yongjun.
      
      11) Missing device put in hns driver, from Salil Mehta.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (44 commits)
        sky2: Increase D3 delay again
        vhost: correctly check the return value of translate_desc() in log_used()
        net: netcp: Fix ethss driver probe issue
        net: hns: Fixes the missing put_device in positive leg for roce reset
        net: stmmac: Fix a race in EEE enable callback
        qed: Fix iWARP syn packet mac address validation.
        qed: Fix iWARP buffer size provided for syn packet processing.
        r8152: Add support for MAC address pass through on RTL8153-BD
        mac80211: mesh: fix missing unlock on error in table_path_del()
        net/mlx4_en: fix spelling mistake: "quiting" -> "quitting"
        net: crypto set sk to NULL when af_alg_release.
        net: Do not allocate page fragments that are not skb aligned
        mm: Use fixed constant in page_frag_alloc instead of size + 1
        tcp: tcp_v4_err() should be more careful
        tcp: clear icsk_backoff in tcp_write_queue_purge()
        net: mv643xx_eth: disable clk on error path in mv643xx_eth_shared_probe()
        qmi_wwan: apply SET_DTR quirk to Sierra WP7607
        net: stmmac: handle endianness in dwmac4_get_timestamp
        doc: Mention MSG_ZEROCOPY implementation for UDP
        mlxsw: __mlxsw_sp_port_headroom_set(): Fix a use of local variable
        ...
  3. 19 Feb, 2019 13 commits
  4. 18 Feb, 2019 8 commits
    • net/mlx4_en: fix spelling mistake: "quiting" -> "quitting" · 21d2cb49
      Colin Ian King authored
      There is a spelling mistake in an en_err error message. Fix it.
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: crypto set sk to NULL when af_alg_release. · 9060cb71
      Mao Wenan authored
      KASAN has found a use-after-free in sockfs_setattr.
      The existing commit 6d8c50dc ("socket: close race condition between sock_close()
      and sockfs_setattr()") fixes a similar issue, but it does not cover the
      crypto module, which forgets to set sk to NULL after af_alg_release.
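
      The idea can be sketched in user space as follows. This is a toy
      model with hypothetical names (toy_sk, toy_socket, toy_release,
      toy_setattr), not the kernel code, and the -ENOENT return value is an
      assumption made for the sketch. Once the release path clears the
      pointer, a later attribute update sees NULL instead of writing to
      freed memory.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the sock and socket objects. */
struct toy_sk {
    int uid;
};

struct toy_socket {
    struct toy_sk *sk;
};

/* Release path: drop the object and clear the pointer to it. */
static void toy_release(struct toy_socket *sock)
{
    assert(sock != NULL);
    if (sock->sk) {
        free(sock->sk);
        sock->sk = NULL; /* without this, later users see a dangling pointer */
    }
}

/* Attribute update: write through sk only while it is still attached. */
static int toy_setattr(struct toy_socket *sock, int uid)
{
    if (!sock->sk)
        return -ENOENT; /* assumed error code for "already released" */
    sock->sk->uid = uid;
    return 0;
}
```

      Note that this single-threaded sketch only shows the pointer
      hygiene; it does not model the locking that closes the race in the
      referenced commit.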
      
      KASAN report details as below:
      BUG: KASAN: use-after-free in sockfs_setattr+0x120/0x150
      Write of size 4 at addr ffff88837b956128 by task syz-executor0/4186
      
      CPU: 2 PID: 4186 Comm: syz-executor0 Not tainted xxx + #1
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
      1.10.2-1ubuntu1 04/01/2014
      Call Trace:
       dump_stack+0xca/0x13e
       print_address_description+0x79/0x330
       ? vprintk_func+0x5e/0xf0
       kasan_report+0x18a/0x2e0
       ? sockfs_setattr+0x120/0x150
       sockfs_setattr+0x120/0x150
       ? sock_register+0x2d0/0x2d0
       notify_change+0x90c/0xd40
       ? chown_common+0x2ef/0x510
       chown_common+0x2ef/0x510
       ? chmod_common+0x3b0/0x3b0
       ? __lock_is_held+0xbc/0x160
       ? __sb_start_write+0x13d/0x2b0
       ? __mnt_want_write+0x19a/0x250
       do_fchownat+0x15c/0x190
       ? __ia32_sys_chmod+0x80/0x80
       ? trace_hardirqs_on_thunk+0x1a/0x1c
       __x64_sys_fchownat+0xbf/0x160
       ? lockdep_hardirqs_on+0x39a/0x5e0
       do_syscall_64+0xc8/0x580
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x462589
      Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48 89
      f7 48 89 d6 48 89
      ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3
      48 c7 c1 bc ff ff
      ff f7 d8 64 89 01 48
      RSP: 002b:00007fb4b2c83c58 EFLAGS: 00000246 ORIG_RAX: 0000000000000104
      RAX: ffffffffffffffda RBX: 000000000072bfa0 RCX: 0000000000462589
      RDX: 0000000000000000 RSI: 00000000200000c0 RDI: 0000000000000007
      RBP: 0000000000000005 R08: 0000000000001000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00007fb4b2c846bc
      R13: 00000000004bc733 R14: 00000000006f5138 R15: 00000000ffffffff
      
      Allocated by task 4185:
       kasan_kmalloc+0xa0/0xd0
       __kmalloc+0x14a/0x350
       sk_prot_alloc+0xf6/0x290
       sk_alloc+0x3d/0xc00
       af_alg_accept+0x9e/0x670
       hash_accept+0x4a3/0x650
       __sys_accept4+0x306/0x5c0
       __x64_sys_accept4+0x98/0x100
       do_syscall_64+0xc8/0x580
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Freed by task 4184:
       __kasan_slab_free+0x12e/0x180
       kfree+0xeb/0x2f0
       __sk_destruct+0x4e6/0x6a0
       sk_destruct+0x48/0x70
       __sk_free+0xa9/0x270
       sk_free+0x2a/0x30
       af_alg_release+0x5c/0x70
       __sock_release+0xd3/0x280
       sock_close+0x1a/0x20
       __fput+0x27f/0x7f0
       task_work_run+0x136/0x1b0
       exit_to_usermode_loop+0x1a7/0x1d0
       do_syscall_64+0x461/0x580
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Syzkaller reproducer:
      r0 = perf_event_open(&(0x7f0000000000)={0x0, 0x70, 0x0, 0x0, 0x0, 0x0,
      0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
      0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
      0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, @perf_config_ext}, 0x0, 0x0,
      0xffffffffffffffff, 0x0)
      r1 = socket$alg(0x26, 0x5, 0x0)
      getrusage(0x0, 0x0)
      bind(r1, &(0x7f00000001c0)=@alg={0x26, 'hash\x00', 0x0, 0x0,
      'sha256-ssse3\x00'}, 0x80)
      r2 = accept(r1, 0x0, 0x0)
      r3 = accept4$unix(r2, 0x0, 0x0, 0x0)
      r4 = dup3(r3, r0, 0x0)
      fchownat(r4, &(0x7f00000000c0)='\x00', 0x0, 0x0, 0x1000)
      
      Fixes: 6d8c50dc ("socket: close race condition between sock_close() and sockfs_setattr()")
      Signed-off-by: Mao Wenan <maowenan@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'mailbox-fixes-v5.0-rc7' of... · 301e3610
      Linus Torvalds authored
      Merge tag 'mailbox-fixes-v5.0-rc7' of git://git.linaro.org/landing-teams/working/fujitsu/integration
      
      Pull mailbox fixes from Jassi Brar:
      
       - API: Fix build breakage by exporting the function mbox_flush
      
       - BRCM: Fix FlexRM ring flush timeout issue
      
      * tag 'mailbox-fixes-v5.0-rc7' of git://git.linaro.org/landing-teams/working/fujitsu/integration:
        mailbox: bcm-flexrm-mailbox: Fix FlexRM ring flush timeout issue
        mailbox: Export mbox_flush()
    • Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm · 3ddc14e2
      Linus Torvalds authored
      Pull ARM fixes from Russell King:
       "A few ARM fixes:
      
         - Dietmar Eggemann noticed an issue with IRQ migration during CPU
           hotplug stress testing.
      
         - Mathieu Desnoyers noticed that a previous fix broke optimised
           kprobes.
      
         - Robin Murphy noticed a case where we were not clearing the dma_ops"
      
      * tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
        ARM: 8835/1: dma-mapping: Clear DMA ops on teardown
        ARM: 8834/1: Fix: kprobes: optimized kprobes illegal instruction
        ARM: 8824/1: fix a migrating irq bug when hotplug cpu
    • Merge tag 'trace-v5.0-rc4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace · 10f49021
      Linus Torvalds authored
      Pull tracing fixes from Steven Rostedt:
       "Two more tracing fixes
      
         - Have kprobes not use copy_from_user() to access kernel addresses,
           because kprobes can legitimately poke at bad kernel memory, which
           will fault. Copy from user code should never fault in kernel space.
           Using probe_mem_read() can handle kernel address space faulting.
      
         - Put back the entries counter in the tracing output that was
           accidentally removed"
      
      * tag 'trace-v5.0-rc4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        tracing: Fix number of entries in trace header
        kprobe: Do not use uaccess functions to access kernel memory that can fault
    • mailbox: bcm-flexrm-mailbox: Fix FlexRM ring flush timeout issue · d7bf31a0
      Rayagonda Kokatanur authored
      The RING_CONTROL register was not written because the wrong address
      was used, so all subsequent ring flushes timed out.
      
      Fixes: a371c10e ("mailbox: bcm-flexrm-mailbox: Fix FlexRM ring flush sequence")
      Signed-off-by: Rayagonda Kokatanur <rayagonda.kokatanur@broadcom.com>
      Signed-off-by: Ray Jui <ray.jui@broadcom.com>
      Reviewed-by: Scott Branden <scott.branden@broadcom.com>
      Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
    • mailbox: Export mbox_flush() · 4f055779
      Thierry Reding authored
      The mbox_flush() function can be used by drivers that are built as
      modules, so the function needs to be exported.
      Reported-by: Mark Brown <broonie@kernel.org>
      Signed-off-by: Thierry Reding <treding@nvidia.com>
      Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
    • Linux 5.0-rc7 · a3b22b9f
      Linus Torvalds authored
  5. 17 Feb, 2019 12 commits