1. 15 Jun, 2018 9 commits
  2. 13 Jun, 2018 2 commits
    • Anders Roxell's avatar
      selftests: bpf: config: add config fragments · 3bce593a
      Anders Roxell authored
      Tests test_tunnel.sh fails due to config fragments ins't enabled.
      
      Fixes: 933a741e ("selftests/bpf: bpf tunnel test.")
      Signed-off-by: default avatarAnders Roxell <anders.roxell@linaro.org>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      3bce593a
    • Yonghong Song's avatar
      tools/bpftool: fix a bug in bpftool perf · 73df93c5
      Yonghong Song authored
      Commit b04df400 ("tools/bpftool: add perf subcommand")
      introduced bpftool subcommand perf to query bpf program
      kuprobe and tracepoint attachments.
      
      The perf subcommand will first test whether bpf subcommand
      BPF_TASK_FD_QUERY is supported in kernel or not. It does it
      by opening a file with argv[0] and feeds the file descriptor
      and current task pid to the kernel for querying.
      
      Such an approach won't work if the argv[0] cannot be opened
      successfully in the current directory. This is especially
      true when bpftool is accessible through PATH env variable.
      The error below reflects the open failure for file argv[0]
      at home directory.
      
        [yhs@localhost ~]$ which bpftool
        /usr/local/sbin/bpftool
        [yhs@localhost ~]$ bpftool perf
        Error: perf_query_support: No such file or directory
      
      To fix the issue, let us open root directory ("/")
      which exists in every linux system. With the fix, the
      error message will correctly reflect the permission issue.
      
        [yhs@localhost ~]$ which bpftool
        /usr/local/sbin/bpftool
        [yhs@localhost ~]$ bpftool perf
        Error: perf_query_support: Operation not permitted
        HINT: non root or kernel doesn't support TASK_FD_QUERY
      
      Fixes: b04df400 ("tools/bpftool: add perf subcommand")
      Reported-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: default avatarYonghong Song <yhs@fb.com>
      Reviewed-by: default avatarJakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      73df93c5
  3. 12 Jun, 2018 4 commits
    • Björn Töpel's avatar
      xsk: re-add queue id check for XDP_SKB path · 5d902372
      Björn Töpel authored
      Commit 173d3adb ("xsk: add zero-copy support for Rx") introduced a
      regression on the XDP_SKB receive path, when the queue id checks were
      removed. Now, they are back again.
      
      Fixes: 173d3adb ("xsk: add zero-copy support for Rx")
      Reported-by: default avatarQi Zhang <qi.z.zhang@intel.com>
      Signed-off-by: default avatarBjörn Töpel <bjorn.topel@intel.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      5d902372
    • David Miller's avatar
      tcp: Do not reload skb pointer after skb_gro_receive(). · 6892286e
      David Miller authored
      This is not necessary.  skb_gro_receive() will never change what
      'head' points to.
      
      In it's original implementation (see commit 71d93b39 ("net: Add
      skb_gro_receive")), it did:
      
      ====================
      +	*head = nskb;
      +	nskb->next = p->next;
      +	p->next = NULL;
      ====================
      
      This sequence was removed in commit 58025e46 ("net: gro: remove
      obsolete code from skb_gro_receive()")
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      6892286e
    • David S. Miller's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · 0ca69d13
      David S. Miller authored
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf 2018-06-12
      
      The following pull-request contains BPF updates for your *net* tree.
      
      The main changes are:
      
      1) Avoid an allocation warning in AF_XDP by adding __GFP_NOWARN for the
         umem setup, from Björn.
      
      2) Silence a warning in bpf fs when an application tries to open(2) a
         pinned bpf obj due to missing fops. Add a dummy open fop that continues
         to just bail out in such case, from Daniel.
      
      3) Fix a BPF selftest urandom_read build issue where gcc complains that
         it gets built twice, from Anders.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0ca69d13
    • David S. Miller's avatar
      Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue · 93ba168a
      David S. Miller authored
      Jeff Kirsher says:
      
      ====================
      Intel Wired LAN Driver Updates 2018-06-11
      
      This series contains fixes to ixgbe IPsec and MACVLAN.
      
      Alex provides the 5 fixes in this series, starting with fixing an issue
      where num_rx_pools was not being populated until after the queues and
      interrupts were reinitialized when enabling MACVLAN interfaces.  Updated
      to use CONFIG_XFRM_OFFLOAD instead of CONFIG_XFRM, since the code
      requires CONFIG_XFRM_OFFLOAD to be enabled.  Moved the IPsec
      initialization function to be more consistent with the placement of
      similar initialization functions and before the call to reset the
      hardware, which will clean up any link issues that may have been
      introduced.  Fixed the boolean logic that was testing for transmit OR
      receive ready bits, when it should have been testing for transmit AND
      receive ready bits.  Fixed the bit definitions for SECTXSTAT and SECRXSTAT
      registers and ensure that if IPsec is disabled on the part, do not
      enable it.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      93ba168a
  4. 11 Jun, 2018 13 commits
    • David Ahern's avatar
      net/ipv6: Ensure cfg is properly initialized in ipv6_create_tempaddr · 3f2d67b6
      David Ahern authored
      Valdis reported a BUG in ipv6_add_addr:
      
      [ 1820.832682] BUG: unable to handle kernel NULL pointer dereference at 0000000000000209
      [ 1820.832728] RIP: 0010:ipv6_add_addr+0x280/0xd10
      [ 1820.832732] Code: 49 8b 1f 0f 84 6a 0a 00 00 48 85 db 0f 84 4e 0a 00 00 48 8b 03 48 8b 53 08 49 89 45 00 49 8b 47 10
      49 89 55 08 48 85 c0 74 15 <48> 8b 50 08 48 8b 00 49 89 95 b8 01 00 00 49 89 85 b0 01 00 00 4c
      [ 1820.832847] RSP: 0018:ffffaa07c2fd7880 EFLAGS: 00010202
      [ 1820.832853] RAX: 0000000000000201 RBX: ffffaa07c2fd79b0 RCX: 0000000000000000
      [ 1820.832858] RDX: a4cfbfba2cbfa64c RSI: 0000000000000000 RDI: ffffffff8a8e9fa0
      [ 1820.832862] RBP: ffffaa07c2fd7920 R08: 000000000000017a R09: ffffffff8a555300
      [ 1820.832866] R10: 0000000000000000 R11: 0000000000000000 R12: ffff888d18e71c00
      [ 1820.832871] R13: ffff888d0a9b1200 R14: 0000000000000000 R15: ffffaa07c2fd7980
      [ 1820.832876] FS:  00007faa51bdb800(0000) GS:ffff888d1d400000(0000) knlGS:0000000000000000
      [ 1820.832880] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 1820.832885] CR2: 0000000000000209 CR3: 000000021e8f8001 CR4: 00000000001606e0
      [ 1820.832888] Call Trace:
      [ 1820.832898]  ? __local_bh_enable_ip+0x119/0x260
      [ 1820.832904]  ? ipv6_create_tempaddr+0x259/0x5a0
      [ 1820.832912]  ? __local_bh_enable_ip+0x139/0x260
      [ 1820.832921]  ipv6_create_tempaddr+0x2da/0x5a0
      [ 1820.832926]  ? ipv6_create_tempaddr+0x2da/0x5a0
      [ 1820.832941]  manage_tempaddrs+0x1a5/0x240
      [ 1820.832951]  inet6_addr_del+0x20b/0x3b0
      [ 1820.832959]  ? nla_parse+0xce/0x1e0
      [ 1820.832968]  inet6_rtm_deladdr+0xd9/0x210
      [ 1820.832981]  rtnetlink_rcv_msg+0x1d4/0x5f0
      
      Looking at the code I found 1 element (peer_pfx) of the newly introduced
      ifa6_config struct that is not initialized. Use a memset rather than hard
      coding an init for each struct element.
      Reported-by: default avatarValdis Kletnieks <valdis.kletnieks@vt.edu>
      Fixes: e6464b8c ("net/ipv6: Convert ipv6_add_addr to struct ifa6_config")
      Signed-off-by: default avatarDavid Ahern <dsahern@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      3f2d67b6
    • Daniel Borkmann's avatar
      tls: fix NULL pointer dereference on poll · f6fadff3
      Daniel Borkmann authored
      While hacking on kTLS, I ran into the following panic from an
      unprivileged netserver / netperf TCP session:
      
        BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
        PGD 800000037f378067 P4D 800000037f378067 PUD 3c0e61067 PMD 0
        Oops: 0010 [#1] SMP KASAN PTI
        CPU: 1 PID: 2289 Comm: netserver Not tainted 4.17.0+ #139
        Hardware name: LENOVO 20FBCTO1WW/20FBCTO1WW, BIOS N1FET47W (1.21 ) 11/28/2016
        RIP: 0010:          (null)
        Code: Bad RIP value.
        RSP: 0018:ffff88036abcf740 EFLAGS: 00010246
        RAX: dffffc0000000000 RBX: ffff88036f5f6800 RCX: 1ffff1006debed26
        RDX: ffff88036abcf920 RSI: ffff8803cb1a4f00 RDI: ffff8803c258c280
        RBP: ffff8803c258c280 R08: ffff8803c258c280 R09: ffffed006f559d48
        R10: ffff88037aacea43 R11: ffffed006f559d49 R12: ffff8803c258c280
        R13: ffff8803cb1a4f20 R14: 00000000000000db R15: ffffffffc168a350
        FS:  00007f7e631f4700(0000) GS:ffff8803d1c80000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: ffffffffffffffd6 CR3: 00000003ccf64005 CR4: 00000000003606e0
        DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
        Call Trace:
         ? tls_sw_poll+0xa4/0x160 [tls]
         ? sock_poll+0x20a/0x680
         ? do_select+0x77b/0x11a0
         ? poll_schedule_timeout.constprop.12+0x130/0x130
         ? pick_link+0xb00/0xb00
         ? read_word_at_a_time+0x13/0x20
         ? vfs_poll+0x270/0x270
         ? deref_stack_reg+0xad/0xe0
         ? __read_once_size_nocheck.constprop.6+0x10/0x10
        [...]
      
      Debugging further, it turns out that calling into ctx->sk_poll() is
      invalid since sk_poll itself is NULL which was saved from the original
      TCP socket in order for tls_sw_poll() to invoke it.
      
      Looks like the recent conversion from poll to poll_mask callback started
      in 15252423 ("net: add support for ->poll_mask in proto_ops") missed
      to eventually convert kTLS, too: TCP's ->poll was converted over to the
      ->poll_mask in commit 2c7d3dac ("net/tcp: convert to ->poll_mask")
      and therefore kTLS wrongly saved the ->poll old one which is now NULL.
      
      Convert kTLS over to use ->poll_mask instead. Also instead of POLLIN |
      POLLRDNORM use the proper EPOLLIN | EPOLLRDNORM bits as the case in
      tcp_poll_mask() as well that is mangled here.
      
      Fixes: 2c7d3dac ("net/tcp: convert to ->poll_mask")
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Dave Watson <davejwatson@fb.com>
      Tested-by: default avatarDave Watson <davejwatson@fb.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f6fadff3
    • Björn Töpel's avatar
      xsk: silence warning on memory allocation failure · a343993c
      Björn Töpel authored
      syzkaller reported a warning from xdp_umem_pin_pages():
      
        WARNING: CPU: 1 PID: 4537 at mm/slab_common.c:996 kmalloc_slab+0x56/0x70 mm/slab_common.c:996
        ...
        __do_kmalloc mm/slab.c:3713 [inline]
        __kmalloc+0x25/0x760 mm/slab.c:3727
        kmalloc_array include/linux/slab.h:634 [inline]
        kcalloc include/linux/slab.h:645 [inline]
        xdp_umem_pin_pages net/xdp/xdp_umem.c:205 [inline]
        xdp_umem_reg net/xdp/xdp_umem.c:318 [inline]
        xdp_umem_create+0x5c9/0x10f0 net/xdp/xdp_umem.c:349
        xsk_setsockopt+0x443/0x550 net/xdp/xsk.c:531
        __sys_setsockopt+0x1bd/0x390 net/socket.c:1935
        __do_sys_setsockopt net/socket.c:1946 [inline]
        __se_sys_setsockopt net/socket.c:1943 [inline]
        __x64_sys_setsockopt+0xbe/0x150 net/socket.c:1943
        do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287
        entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      This is a warning about attempting to allocate more than
      KMALLOC_MAX_SIZE memory. The request originates from userspace, and if
      the request is too big, the kernel is free to deny its allocation. In
      this patch, the failed allocation attempt is silenced with
      __GFP_NOWARN.
      
      Fixes: c0c77d8f ("xsk: add user memory registration support sockopt")
      Reported-by: syzbot+4abadc5d69117b346506@syzkaller.appspotmail.com
      Signed-off-by: default avatarBjörn Töpel <bjorn.topel@intel.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      a343993c
    • David S. Miller's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf · a08ce73b
      David S. Miller authored
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter/IPVS fixes for net
      
      The following patchset contains Netfilter/IPVS fixes for your net tree:
      
      1) Reject non-null terminated helper names from xt_CT, from Gao Feng.
      
      2) Fix KASAN splat due to out-of-bound access from commit phase, from
         Alexey Kodanev.
      
      3) Missing conntrack hook registration on IPVS FTP helper, from Julian
         Anastasov.
      
      4) Incorrect skbuff allocation size in bridge nft_reject, from Taehee Yoo.
      
      5) Fix inverted check on packet xmit to non-local addresses, also from
         Julian.
      
      6) Fix ebtables alignment compat problems, from Alin Nastac.
      
      7) Hook mask checks are not correct in xt_set, from Serhey Popovych.
      
      8) Fix timeout listing of element in ipsets, from Jozsef.
      
      9) Cap maximum timeout value in ipset, also from Jozsef.
      
      10) Don't allow family option for hash:mac sets, from Florent Fourcot.
      
      11) Restrict ebtables to work with NFPROTO_BRIDGE targets only, this
          Florian.
      
      12) Another bug reported by KASAN in the rbtree set backend, from
          Taehee Yoo.
      
      13) Missing __IPS_MAX_BIT update doesn't include IPS_OFFLOAD_BIT.
          From Gao Feng.
      
      14) Missing initialization of match/target in ebtables, from Florian
          Westphal.
      
      15) Remove useless nft_dup.h file in include path, from C. Labbe.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a08ce73b
    • Zhouyang Jia's avatar
      net: dsa: add error handling for pskb_trim_rcsum · 349b71d6
      Zhouyang Jia authored
      When pskb_trim_rcsum fails, the lack of error-handling code may
      cause unexpected results.
      
      This patch adds error-handling code after calling pskb_trim_rcsum.
      Signed-off-by: default avatarZhouyang Jia <jiazhouyang09@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      349b71d6
    • Julian Anastasov's avatar
      ipv6: allow PMTU exceptions to local routes · 09757646
      Julian Anastasov authored
      IPVS setups with local client and remote tunnel server need
      to create exception for the local virtual IP. What we do is to
      change PMTU from 64KB (on "lo") to 1460 in the common case.
      Suggested-by: default avatarMartin KaFai Lau <kafai@fb.com>
      Fixes: 45e4fd26 ("ipv6: Only create RTF_CACHE routes after encountering pmtu exception")
      Fixes: 7343ff31 ("ipv6: Don't create clones of host routes.")
      Signed-off-by: default avatarJulian Anastasov <ja@ssi.bg>
      Acked-by: default avatarDavid Ahern <dsahern@gmail.com>
      Acked-by: default avatarMartin KaFai Lau <kafai@fb.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      09757646
    • Alexander Duyck's avatar
      ixgbe: Fix bit definitions and add support for testing for ipsec support · 421d954c
      Alexander Duyck authored
      This patch addresses two issues. First it adds the correct bit definitions
      for the SECTXSTAT and SECRXSTAT registers. Then it makes use of those
      definitions to test for if IPsec has been disabled on the part and if so we
      do not enable it.
      Signed-off-by: default avatarAlexander Duyck <alexander.h.duyck@intel.com>
      Reported-by: default avatarAndre Tomt <andre@tomt.net>
      Acked-by: default avatarShannon Nelson <shannon.nelson@oracle.com>
      Tested-by: default avatarAndrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: default avatarJeff Kirsher <jeffrey.t.kirsher@intel.com>
      421d954c
    • Alexander Duyck's avatar
      ixgbe: Avoid loopback and fix boolean logic in ipsec_stop_data · e9f655ee
      Alexander Duyck authored
      This patch fixes two issues. First we add an early test for the Tx and Rx
      security block ready bits. By doing this we can avoid the need for waits or
      loopback in the event that the security block is already flushed out.
      Secondly we fix the boolean logic that was testing for the Tx OR Rx ready
      bits being set and change it so that we only exit if the Tx AND Rx ready
      bits are both set.
      Signed-off-by: default avatarAlexander Duyck <alexander.h.duyck@intel.com>
      Acked-by: default avatarShannon Nelson <shannon.nelson@oracle.com>
      Tested-by: default avatarAndrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: default avatarJeff Kirsher <jeffrey.t.kirsher@intel.com>
      e9f655ee
    • Alexander Duyck's avatar
      ixgbe: Move ipsec init function to before reset call · de7a7e34
      Alexander Duyck authored
      This patch moves the IPsec init function in ixgbe_sw_init. This way it is a
      bit more consistent with the placement of similar initialization functions
      and is placed before the reset_hw call which should allow us to clean up
      any link issues that may be introduced by the fact that we force the link
      up if somehow the device had IPsec still enabled before the driver was
      loaded.
      
      In addition to the function move it is necessary to change the assignment
      of netdev->features. The easiest way to do this is to just test for the
      existence of adapter->ipsec and if it is present we set the feature bits.
      
      Fixes: 49a94d74 ("ixgbe: add ipsec engine start and stop routines")
      Reported-by: default avatarAndre Tomt <andre@tomt.net>
      Signed-off-by: default avatarAlexander Duyck <alexander.h.duyck@intel.com>
      Acked-by: default avatarShannon Nelson <shannon.nelson@oracle.com>
      Tested-by: default avatarAndrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: default avatarJeff Kirsher <jeffrey.t.kirsher@intel.com>
      de7a7e34
    • Alexander Duyck's avatar
      ixgbe: Use CONFIG_XFRM_OFFLOAD instead of CONFIG_XFRM · e433f3a5
      Alexander Duyck authored
      There is no point in adding code if CONFIG_XFRM is defined that we won't
      use unless CONFIG_XFRM_OFFLOAD is defined. So instead of leaving this code
      floating around I am replacing the ifdef with what I believe is the correct
      one so that we only include the code and variables if they will actually be
      used.
      Signed-off-by: default avatarAlexander Duyck <alexander.h.duyck@intel.com>
      Acked-by: default avatarShannon Nelson <shannon.nelson@oracle.com>
      Tested-by: default avatarAndrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: default avatarJeff Kirsher <jeffrey.t.kirsher@intel.com>
      e433f3a5
    • Alexander Duyck's avatar
      ixgbe: Fix setting of TC configuration for macvlan case · 646bb57c
      Alexander Duyck authored
      When we were enabling macvlan interfaces we weren't correctly configuring
      things until ixgbe_setup_tc was called a second time either by tweaking the
      number of queues or increasing the macvlan count past 15.
      
      The issue came down to the fact that num_rx_pools is not populated until
      after the queues and interrupts are reinitialized.
      
      Instead of trying to set it sooner we can just move the call to setup at
      least 1 traffic class to the SR-IOV/VMDq setup function so that we just set
      it for this one case. We already had a spot that was configuring the queues
      for TC 0 in the code here anyway so it makes sense to also set the number
      of TCs here as well.
      
      Fixes: 49cfbeb7 ("ixgbe: Fix handling of macvlan Tx offload")
      Signed-off-by: default avatarAlexander Duyck <alexander.h.duyck@intel.com>
      Tested-by: default avatarAndrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: default avatarJeff Kirsher <jeffrey.t.kirsher@intel.com>
      646bb57c
    • Anders Roxell's avatar
      selftests: bpf: fix urandom_read build issue · 1c9ca7e9
      Anders Roxell authored
      gcc complains that urandom_read gets built twice.
      
      gcc -o tools/testing/selftests/bpf/urandom_read
      -static urandom_read.c -Wl,--build-id
      gcc -Wall -O2 -I../../../include/uapi -I../../../lib -I../../../lib/bpf
      -I../../../../include/generated  -I../../../include    urandom_read.c
      urandom_read -lcap -lelf -lrt -lpthread -o
      tools/testing/selftests/bpf/urandom_read
      gcc: fatal error: input file
      ‘tools/testing/selftests/bpf/urandom_read’ is the
      same as output file
      compilation terminated.
      ../lib.mk:110: recipe for target
      'tools/testing/selftests/bpf/urandom_read' failed
      To fix this issue remove the urandom_read target and so target
      TEST_CUSTOM_PROGS gets used.
      
      Fixes: 81f77fd0 ("bpf: add selftest for stackmap with BPF_F_STACK_BUILD_ID")
      Signed-off-by: default avatarAnders Roxell <anders.roxell@linaro.org>
      Acked-by: default avatarYonghong Song <yhs@fb.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      1c9ca7e9
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · f0dc7f9c
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Fix several bpfilter/UMH bugs, in particular make the UMH build not
          depend upon X86 specific Kconfig symbols. From Alexei Starovoitov.
      
       2) Fix handling of modified context pointer in bpf verifier, from
          Daniel Borkmann.
      
       3) Kill regression in ifdown/ifup sequences for hv_netvsc driver, from
          Dexuan Cui.
      
       4) When the bonding primary member name changes, we have to re-evaluate
          the bond->force_primary setting, from Xiangning Yu.
      
       5) Eliminate possible padding beyone end of SKB in cdc_ncm driver, from
          Bjørn Mork.
      
       6) RX queue length reported for UDP sockets in procfs and socket diag
          are inaccurate, from Paolo Abeni.
      
       7) Fix br_fdb_find_port() locking, from Petr Machata.
      
       8) Limit sk_rcvlowat values properly in TCP, from Soheil Hassas
          Yeganeh.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (23 commits)
        tcp: limit sk_rcvlowat by the maximum receive buffer
        net: phy: dp83822: use BMCR_ANENABLE instead of BMSR_ANEGCAPABLE for DP83620
        socket: close race condition between sock_close() and sockfs_setattr()
        net: bridge: Fix locking in br_fdb_find_port()
        udp: fix rx queue len reported by diag and proc interface
        cdc_ncm: avoid padding beyond end of skb
        net/sched: act_simple: fix parsing of TCA_DEF_DATA
        net: fddi: fix a possible null-ptr-deref
        net: aquantia: fix unsigned numvecs comparison with less than zero
        net: stmmac: fix build failure due to missing COMMON_CLK dependency
        bpfilter: fix race in pipe access
        bpf, xdp: fix crash in xdp_umem_unaccount_pages
        xsk: Fix umem fill/completion queue mmap on 32-bit
        tools/bpf: fix selftest get_cgroup_id_user
        bpfilter: fix OUTPUT_FORMAT
        umh: fix race condition
        net: mscc: ocelot: Fix uninitialized error in ocelot_netdevice_event()
        bonding: re-evaluate force_primary when the primary slave name changes
        ip_tunnel: Fix name string concatenate in __ip_tunnel_create()
        hv_netvsc: Fix a network regression after ifdown/ifup
        ...
      f0dc7f9c
  5. 10 Jun, 2018 12 commits
    • Linus Torvalds's avatar
      Merge tag 'rtc-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux · 1aaccb5f
      Linus Torvalds authored
      Pull RTC updates from Alexandre Belloni:
       "Setting the supported range from drivers for RTCs failing soon has
        started. A few fixes are developed along the way. Some drivers have
        been switched to SPDX by their maintainers.
      
        Subsystem:
      
         - rework of the rtc-test driver which allows to test the core more
           thoroughly
      
         - rtc_set_alarm() now fails early when alarms are not supported
      
        Drivers:
      
         - mktime() is now replaced by mktime64()
      
         - RTC range added for 88pm80x, ab-b5ze-s3, at91rm9200,
           brcmstb-waketimer, ds1685, ftrtc010, ls1x, mxc_v2, rx8581, sprd,
           st-lpc, tps6586x, tps65910 and vr41xx
      
         - fixed a possible race condition in probe functions
      
         - pxa: fix the probe function that is broken since v4.3
      
         - stm32: now supports stm32mp1"
      
      * tag 'rtc-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: (78 commits)
        rtc: pxa: fix probe function
        rtc: cros-ec: Switch to SPDX identifier.
        rtc: cros-ec: Make license text and module license match.
        rtc: ensure rtc_set_alarm fails when alarms are not supported
        rtc: test: remove alarm support from the first device
        rtc: test: convert to devm_rtc_allocate_device
        rtc: ftrtc010: let the core handle range
        rtc: ftrtc010: handle dates after 2106
        rtc: ftrtc010: switch to devm_rtc_allocate_device
        rtc: mrst: switch to devm functions
        rtc: sunxi: fix possible race condition
        rtc: test: remove irq sysfs file
        rtc: test: emulate alarms using timers
        rtc: test: store time as an offset to system time
        rtc: test: allow registering many devices
        rtc: test: remove useless proc info
        rtc: ds1685: Add range
        rtc: ds1685: fix possible race condition
        rtc: sprd: Add new RTC power down check method
        rtc: sun6i: Fix bit_idx value for clk_register_gate
        ...
      1aaccb5f
    • Linus Torvalds's avatar
      Merge tag 'upstream-4.18-rc1' of git://git.infradead.org/linux-ubifs · ab0b2e59
      Linus Torvalds authored
      Pull UBI and UBIFS updates from Richard Weinberger:
      
       - the UBI on-disk format header file is now dual licensed
      
       - new way to detect Fastmap problems during runtime
      
       - bugfix for Fastmap
      
       - minor updates for UBIFS (spelling, comments, vm_fault_t, ...)
      
      * tag 'upstream-4.18-rc1' of git://git.infradead.org/linux-ubifs:
        mtd: ubi: Update ubi-media.h to dual license
        ubi: fastmap: Detect EBA mismatches on-the-fly
        ubi: fastmap: Check each mapping only once
        ubi: fastmap: Correctly handle interrupted erasures in EBA
        ubi: fastmap: Cancel work upon detach
        ubifs: lpt: Fix wrong pnode number range in comment
        ubifs: gc: Fix typo
        ubifs: log: Some spelling fixes
        ubifs: Spelling fix someting -> something
        ubifs: journal: Remove wrong comment
        ubifs: remove set but never used variable
        ubifs, xattr: remove misguided quota flags
        fs: ubifs: Adding new return type vm_fault_t
      ab0b2e59
    • Soheil Hassas Yeganeh's avatar
      tcp: limit sk_rcvlowat by the maximum receive buffer · 867f816b
      Soheil Hassas Yeganeh authored
      The user-provided value to setsockopt(SO_RCVLOWAT) can be
      larger than the maximum possible receive buffer. Such values
      mute POLLIN signals on the socket which can stall progress
      on the socket.
      
      Limit the user-provided value to half of the maximum receive
      buffer, i.e., half of sk_rcvbuf when the receive buffer size
      is set by the user, or otherwise half of sysctl_tcp_rmem[2].
      
      Fixes: d1361840 ("tcp: fix SO_RCVLOWAT and RCVBUF autotuning")
      Signed-off-by: default avatarSoheil Hassas Yeganeh <soheil@google.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reviewed-by: default avatarNeal Cardwell <ncardwell@google.com>
      Acked-by: default avatarWillem de Bruijn <willemb@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      867f816b
    • Linus Torvalds's avatar
      Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 5f85942c
      Linus Torvalds authored
      Pull SCSI updates from James Bottomley:
       "This is mostly updates to the usual drivers: ufs, qedf, mpt3sas, lpfc,
        xfcp, hisi_sas, cxlflash, qla2xxx.
      
        In the absence of Nic, we're also taking target updates which are
        mostly minor except for the tcmu refactor.
      
        The only real core change to worry about is the removal of high page
        bouncing (in sas, storvsc and iscsi). This has been well tested and no
        problems have shown up so far"
      
      * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (268 commits)
        scsi: lpfc: update driver version to 12.0.0.4
        scsi: lpfc: Fix port initialization failure.
        scsi: lpfc: Fix 16gb hbas failing cq create.
        scsi: lpfc: Fix crash in blk_mq layer when executing modprobe -r lpfc
        scsi: lpfc: correct oversubscription of nvme io requests for an adapter
        scsi: lpfc: Fix MDS diagnostics failure (Rx < Tx)
        scsi: hisi_sas: Mark PHY as in reset for nexus reset
        scsi: hisi_sas: Fix return value when get_free_slot() failed
        scsi: hisi_sas: Terminate STP reject quickly for v2 hw
        scsi: hisi_sas: Add v2 hw force PHY function for internal ATA command
        scsi: hisi_sas: Include TMF elements in struct hisi_sas_slot
        scsi: hisi_sas: Try wait commands before before controller reset
        scsi: hisi_sas: Init disks after controller reset
        scsi: hisi_sas: Create a scsi_host_template per HW module
        scsi: hisi_sas: Reset disks when discovered
        scsi: hisi_sas: Add LED feature for v3 hw
        scsi: hisi_sas: Change common allocation mode of device id
        scsi: hisi_sas: change slot index allocation mode
        scsi: hisi_sas: Introduce hisi_sas_phy_set_linkrate()
        scsi: hisi_sas: fix a typo in hisi_sas_task_prep()
        ...
      5f85942c
    • Alvaro Gamez Machado's avatar
      net: phy: dp83822: use BMCR_ANENABLE instead of BMSR_ANEGCAPABLE for DP83620 · b718e8c8
      Alvaro Gamez Machado authored
      DP83620 register set is compatible with the DP83848, but it also supports
      100base-FX. When the hardware is configured such as that fiber mode is
      enabled, autonegotiation is not possible.
      
      The chip, however, doesn't expose this information via BMSR_ANEGCAPABLE.
      Instead, this bit is always set high, even if the particular hardware
      configuration makes it so that auto negotiation is not possible [1]. Under
      these circumstances, the phy subsystem keeps trying for autonegotiation to
      happen, without success.
      
      Hereby, we inspect BMCR_ANENABLE bit after genphy_config_init, which on
      reset is set to 0 when auto negotiation is disabled, and so we use this
      value instead of BMSR_ANEGCAPABLE.
      
      [1] https://e2e.ti.com/support/interface/ethernet/f/903/p/697165/2571170Signed-off-by: default avatarAlvaro Gamez Machado <alvaro.gamez@hazent.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b718e8c8
    • Cong Wang's avatar
      socket: close race condition between sock_close() and sockfs_setattr() · 6d8c50dc
      Cong Wang authored
      fchownat() doesn't even hold refcnt of fd until it figures out
      fd is really needed (otherwise is ignored) and releases it after
      it resolves the path. This means sock_close() could race with
      sockfs_setattr(), which leads to a NULL pointer dereference
      since typically we set sock->sk to NULL in ->release().
      
      As pointed out by Al, this is unique to sockfs. So we can fix this
      in socket layer by acquiring inode_lock in sock_close() and
      checking against NULL in sockfs_setattr().
      
      sock_release() is called in many places, only the sock_close()
      path matters here. And fortunately, this should not affect normal
      sock_close() as it is only called when the last fd refcnt is gone.
      It only affects sock_close() with a parallel sockfs_setattr() in
      progress, which is not common.
      
      Fixes: 86741ec2 ("net: core: Add a UID field to struct sock.")
      Reported-by: default avatarshankarapailoor <shankarapailoor@gmail.com>
      Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
      Cc: Lorenzo Colitti <lorenzo@google.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      6d8c50dc
    • Linus Torvalds's avatar
      Merge tag '4.18-fixes-smb3' of git://git.samba.org/sfrench/cifs-2.6 · 0c14e43a
      Linus Torvalds authored
      Pull cifs fixes from Steve French:
      
       - one smb3 (ACL related) fix for stable
      
       - one SMB3 security enhancement (when mounting -t smb3 forbid less
         secure dialects)
      
       - some RDMA and compounding fixes
      
      * tag '4.18-fixes-smb3' of git://git.samba.org/sfrench/cifs-2.6:
        cifs: fix a buffer leak in smb2_query_symlink
        smb3: do not allow insecure cifs mounts when using smb3
        CIFS: Fix NULL ptr deref
        CIFS: fix encryption in SMB3.1.1
        CIFS: Pass page offset for encrypting
        CIFS: Pass page offset for calculating signature
        CIFS: SMBD: Support page offset in memory registration
        CIFS: SMBD: Support page offset in RDMA recv
        CIFS: SMBD: Support page offset in RDMA send
        CIFS: When sending data on socket, pass the correct page offset
        CIFS: Introduce helper function to get page offset and length in smb_rqst
        CIFS: Calculate the correct request length based on page offset and tail size
        cifs: For SMB2 security informaion query, check for minimum sized security descriptor instead of sizeof FileAllInformation class
        CIFS: Fix signing for SMB2/3
      0c14e43a
    • Linus Torvalds's avatar
      Merge tag 'for-linus-20180610' of git://git.kernel.dk/linux-block · bbaa1013
      Linus Torvalds authored
      Pull block flush handling fix from Jens Axboe:
       "Single fix that we should merge now, fixing a regression in queuing
        flush request, accessing request flags after calling the end_request
        handler"
      
      * tag 'for-linus-20180610' of git://git.kernel.dk/linux-block:
        block: fix use-after-free in block flush handling
      bbaa1013
    • Linus Torvalds's avatar
      Merge branch 'core-rseq-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · d82991a8
      Linus Torvalds authored
      Pull restartable sequence support from Thomas Gleixner:
       "The restartable sequences syscall (finally):
      
        After a lot of back and forth discussion and massive delays caused by
        the speculative distraction of maintainers, the core set of
        restartable sequences has finally reached a consensus.
      
        It comes with the basic non disputed core implementation along with
        support for arm, powerpc and x86 and a full set of selftests
      
        It was exposed to linux-next earlier this week, so it does not fully
        comply with the merge window requirements, but there is really no
        point to drag it out for yet another cycle"
      
      * 'core-rseq-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        rseq/selftests: Provide Makefile, scripts, gitignore
        rseq/selftests: Provide parametrized tests
        rseq/selftests: Provide basic percpu ops test
        rseq/selftests: Provide basic test
        rseq/selftests: Provide rseq library
        selftests/lib.mk: Introduce OVERRIDE_TARGETS
        powerpc: Wire up restartable sequences system call
        powerpc: Add syscall detection for restartable sequences
        powerpc: Add support for restartable sequences
        x86: Wire up restartable sequence system call
        x86: Add support for restartable sequences
        arm: Wire up restartable sequences system call
        arm: Add syscall detection for restartable sequences
        arm: Add restartable sequences support
        rseq: Introduce restartable sequences system call
        uapi/headers: Provide types_32_64.h
      d82991a8
    • Linus Torvalds's avatar
      Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · f4e5b30d
      Linus Torvalds authored
      Pull x86 updates and fixes from Thomas Gleixner:
      
       - Fix the (late) fallout from the vector management rework causing
         hlist corruption and irq descriptor reference leaks caused by a
         missing sanity check.
      
         The straight forward fix triggered another long standing issue to
         surface. The pre rework code hid the issue due to being way slower,
         but now the chance that user space sees an EBUSY error return when
         updating irq affinities is way higher, though quite a bunch of
         userspace tools do not handle it properly despite the fact that EBUSY
         could be returned for at least 10 years.
      
         It turned out that the EBUSY return can be avoided completely by
         utilizing the existing delayed affinity update mechanism for irq
         remapped scenarios as well. That's a bit more error handling in the
         kernel, but avoids fruitless fingerpointing discussions with tool
         developers.
      
       - Decouple PHYSICAL_MASK from AMD SME as its going to be required for
         the upcoming Intel memory encryption support as well.
      
       - Handle legacy device ACPI detection properly for newer platforms
      
       - Fix the wrong argument ordering in the vector allocation tracepoint
      
       - Simplify the IDT setup code for the APIC=n case
      
       - Use the proper string helpers in the MTRR code
      
       - Remove a stale unused VDSO source file
      
       - Convert the microcode update lock to a raw spinlock as its used in
         atomic context.
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/intel_rdt: Enable CMT and MBM on new Skylake stepping
        x86/apic/vector: Print APIC control bits in debugfs
        genirq/affinity: Defer affinity setting if irq chip is busy
        x86/platform/uv: Use apic_ack_irq()
        x86/ioapic: Use apic_ack_irq()
        irq_remapping: Use apic_ack_irq()
        x86/apic: Provide apic_ack_irq()
        genirq/migration: Avoid out of line call if pending is not set
        genirq/generic_pending: Do not lose pending affinity update
        x86/apic/vector: Prevent hlist corruption and leaks
        x86/vector: Fix the args of vector_alloc tracepoint
        x86/idt: Simplify the idt_setup_apic_and_irq_gates()
        x86/platform/uv: Remove extra parentheses
        x86/mm: Decouple dynamic __PHYSICAL_MASK from AMD SME
        x86: Mark native_set_p4d() as __always_inline
        x86/microcode: Make the late update update_lock a raw lock for RT
        x86/mtrr: Convert to use strncpy_from_user() helper
        x86/mtrr: Convert to use match_string() helper
        x86/vdso: Remove unused file
        x86/i8237: Register device based on FADT legacy boot flag
      f4e5b30d
    • Linus Torvalds's avatar
      Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · a2211de0
      Linus Torvalds authored
      Pull x86 pti updates from Thomas Gleixner:
       "Three small commits updating the SSB mitigation to take the updated
        AMD mitigation variants into account"
      
      * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/bugs: Switch the selection of mitigation from CPU vendor to CPU features
        x86/bugs: Add AMD's SPEC_CTRL MSR usage
        x86/bugs: Add AMD's variant of SSB_NO
      a2211de0
    • Linus Torvalds's avatar
      Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 2322d6c5
      Linus Torvalds authored
      Pull more perf tooling updates from Thomas Gleixner:
       "Perf tool updates and fixes:
      
        perf stat:
      
         - Display user and system time for workload targets (Jiri Olsa)
      
        perf record:
      
         - Enable arbitrary event names thru name= modifier (Alexey Budankov)
      
        PowerPC:
      
         - Add a python script for hypervisor call statistics (Ravi Bangoria)
      
        Intel PT: (Adrian Hunter)
      
         - Fix sync_switch INTEL_PT_SS_NOT_TRACING
      
         - Fix decoding to accept CBR between FUP and corresponding TIP
      
         - Fix MTC timing after overflow
      
         - Fix "Unexpected indirect branch" error
      
        perf test:
      
         - record+probe_libc_inet_pton:
            - To get the symbol table for dynamic shared objects on ubuntu we
              need to pass the -D/--dynamic command line option, unlike with
              the fedora distros (Arnaldo Carvalho de Melo)
      
         - code-reading:
            - Fix perf_env setup for PTI entry trampolines (Adrian Hunter)
      
         - kmod-path:
            - Add tests for vdso32 and vdsox32 (Adrian Hunter)
      
         - Use header file util/debug.h (Thomas Richter)
      
        perf annotate:
      
         - Make the various UI backends (stdio, TUI, gtk) use more
           consistently structs with annotation options as specified by the
           user (Arnaldo Carvalho de Melo)
      
         - Move annotation specific knobs from the symbol_conf global kitchen
           sink to the annotation option structs (Arnaldo Carvalho de Melo)
      
        perf script:
      
         - Add more PMU fields to python scripts event handler dict (Jin Yao)
      
        Core:
      
         - Fix misleading error for some unparsable events mentioning PMUs
           when those are not involved in the problem (Jiri Olsa)
      
         - Consider BSS symbols when processing /proc/kallsyms ('B' and 'b')
           (Arnaldo Carvalho de Melo)
      
         - Be more robust when trying to use per-symbol histograms, checking
           for unlikely but possible cases where the space for the histograms
           wasn't allocated, print a debug message for such cases (Arnaldo
           Carvalho de Melo)
      
         - Fix symbol and object code resolution for vdso32 and vdsox32
           (Adrian Hunter)
      
         - No need to check for null when passing pointers to foo__get() style
           refcount grabbing helpers, just like in the kernel and with free(),
           its safe to pass a NULL pointer to avoid having to check it before
           each and every foo__get() call (Arnaldo Carvalho de Melo)
      
         - Remove some dead code (quote.[ch]) (Arnaldo Carvalho de Melo)
      
         - Remove some needless globals, making them local (Arnaldo Carvalho
           de Melo)
      
         - Reduce usage of symbol_conf.use_callchain, using other means of
           finding out if callchains are in use or available for specific
           events, as we evolved this codebase to allow requesting callchains
           for just a subset of the monitored events. In time it will help
           polish recording and showing mixed sets accross the various tools:
      
              perf record -e cycles/call-graph=fp/,cache-misses/call-graph=dwarf/,instructions'
      
           (Arnaldo Carvalho de Melo)
      
         - Consider PTI entry trampolines in map__rip_2objdump() (Adrian
           Hunter)"
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (50 commits)
        perf script python: Add dict fields introduction to Documentation
        perf script python: Add more PMU fields to event handler dict
        perf script python: Move dsoname code to a new function
        perf symbols: Add BSS symbols when reading from /proc/kallsyms
        perf annnotate: Make __symbol__inc_addr_samples handle src->histograms == NULL
        perf intel-pt: Fix "Unexpected indirect branch" error
        perf intel-pt: Fix MTC timing after overflow
        perf intel-pt: Fix decoding to accept CBR between FUP and corresponding TIP
        perf intel-pt: Fix sync_switch INTEL_PT_SS_NOT_TRACING
        perf script powerpc: Python script for hypervisor call statistics
        perf test record+probe_libc_inet_pton: Ask 'nm' for dynamic symbols
        perf map: Consider PTI entry trampolines in rip_2objdump()
        perf test code-reading: Fix perf_env setup for PTI entry trampolines
        perf tools: Fix pmu events parsing rule
        perf stat: Display user and system time
        perf record: Enable arbitrary event names thru name= modifier
        perf tools: Fix symbol and object code resolution for vdso32 and vdsox32
        perf tests kmod-path: Add tests for vdso32 and vdsox32
        perf hists: Check if a hist_entry has callchains before using them
        perf hists: Introduce hist_entry__has_callchain() method
        ...
      2322d6c5