1. 07 Mar, 2019 10 commits
    • bpf: Stop the psock parser before canceling its work · e8e34377
      Jakub Sitnicki authored
      We might have never enabled (started) the psock's parser, in which case it
      will not get stopped when destroying the psock. This leads to a warning
      when trying to cancel parser's work from psock's deferred destructor:
      
      [  405.325769] WARNING: CPU: 1 PID: 3216 at net/strparser/strparser.c:526 strp_done+0x3c/0x40
      [  405.326712] Modules linked in: [last unloaded: test_bpf]
      [  405.327359] CPU: 1 PID: 3216 Comm: kworker/1:164 Tainted: G        W         5.0.0 #42
      [  405.328294] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20180531_142017-buildhw-08.phx2.fedoraproject.org-1.fc28 04/01/2014
      [  405.329712] Workqueue: events sk_psock_destroy_deferred
      [  405.330254] RIP: 0010:strp_done+0x3c/0x40
      [  405.330706] Code: 28 e8 b8 d5 6b ff 48 8d bb 80 00 00 00 e8 9c d5 6b ff 48 8b 7b 18 48 85 ff 74 0d e8 1e a5 e8 ff 48 c7 43 18 00 00 00 00 5b c3 <0f> 0b eb cf 66 66 66 66 90 55 89 f5 53 48 89 fb 48 83 c7 28 e8 0b
      [  405.332862] RSP: 0018:ffffc900026bbe50 EFLAGS: 00010246
      [  405.333482] RAX: ffffffff819323e0 RBX: ffff88812cb83640 RCX: ffff88812cb829e8
      [  405.334228] RDX: 0000000000000001 RSI: ffff88812cb837e8 RDI: ffff88812cb83640
      [  405.335366] RBP: ffff88813fd22680 R08: 0000000000000000 R09: 000073746e657665
      [  405.336472] R10: 8080808080808080 R11: 0000000000000001 R12: ffff88812cb83600
      [  405.337760] R13: 0000000000000000 R14: ffff88811f401780 R15: ffff88812cb837e8
      [  405.338777] FS:  0000000000000000(0000) GS:ffff88813fd00000(0000) knlGS:0000000000000000
      [  405.339903] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  405.340821] CR2: 00007fb11489a6b8 CR3: 000000012d4d6000 CR4: 00000000000406e0
      [  405.341981] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  405.343131] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  405.344415] Call Trace:
      [  405.344821]  sk_psock_destroy_deferred+0x23/0x1b0
      [  405.345585]  process_one_work+0x1ae/0x3e0
      [  405.346110]  worker_thread+0x3c/0x3b0
      [  405.346576]  ? pwq_unbound_release_workfn+0xd0/0xd0
      [  405.347187]  kthread+0x11d/0x140
      [  405.347601]  ? __kthread_parkme+0x80/0x80
      [  405.348108]  ret_from_fork+0x35/0x40
      [  405.348566] ---[ end trace a4a3af4026a327d4 ]---
      
      Stop psock's parser just before canceling its work.
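
The ordering problem can be sketched with a small user-space model. This is not the kernel code: `parser_model`, `stop_if_enabled`, `parser_done`, `destroy_old` and `destroy_fixed` are hypothetical names, and the model only assumes what the message states, namely that a parser that was never started is never stopped, and that finishing an unstopped parser triggers the warning.

```c
#include <stdbool.h>

/* Hypothetical user-space model; not the kernel's strparser structures. */
struct parser_model {
    bool enabled;   /* the owner started the parser at some point */
    bool stopped;   /* the parser has been told to stop */
};

/* Models the conditional stop: nothing happens if the parser never ran. */
static void stop_if_enabled(struct parser_model *p)
{
    if (p->enabled) {
        p->stopped = true;
        p->enabled = false;
    }
}

/* Models the final cleanup; returns false where the kernel would warn. */
static bool parser_done(const struct parser_model *p)
{
    return p->stopped;
}

/* Old teardown: relies on the conditional stop only. */
static bool destroy_old(struct parser_model *p)
{
    stop_if_enabled(p);
    return parser_done(p);
}

/* Fixed teardown: stop unconditionally right before the final cleanup. */
static bool destroy_fixed(struct parser_model *p)
{
    p->stopped = true;
    return parser_done(p);
}
```

For a parser that was never enabled, `destroy_old()` reports the warning condition while `destroy_fixed()` does not.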
      
      Fixes: 1d79895a ("sk_msg: Always cancel strp work before freeing the psock")
      Reported-by: kernel test robot <rong.a.chen@intel.com>
      Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • selftests: bpf: test_progs: initialize duration in singal_pending test · 69b09175
      Stanislav Fomichev authored
      CHECK macro implicitly uses duration. We call CHECK() a couple of times
      before duration is initialized from bpf_prog_test_run().
      Explicitly set duration to 0 to avoid compiler warnings.
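
A minimal sketch of the pattern, assuming a simplified stand-in for the real macro: `CHECK_MODEL`, `check_report` and `run_checks` are hypothetical names, and the only property modelled is that the macro reads a local variable called `duration` without taking it as an argument.

```c
#include <stdio.h>

/* Helper used by the macro below; returns nonzero when the check failed. */
static int check_report(int failed, const char *tag, unsigned int duration)
{
    printf("%s:%s (%u nsec)\n", tag, failed ? "FAIL" : "PASS", duration);
    return failed;
}

/*
 * Simplified stand-in for CHECK(): it silently picks up whatever variable
 * named `duration` is in scope at the call site.
 */
#define CHECK_MODEL(condition, tag) \
    check_report(!!(condition), (tag), duration)

static int run_checks(int fd)
{
    /* Initialised up front, because the first check runs before any
     * test run has had a chance to fill it in. */
    unsigned int duration = 0;

    if (CHECK_MODEL(fd < 0, "open"))
        return -1;

    duration = 1234; /* placeholder for a value reported by a test run */
    return CHECK_MODEL(0, "run");
}
```

Without the `= 0` initialiser, the first `CHECK_MODEL()` call would read an indeterminate value, which is the situation compilers warn about.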
      
      Fixes: 740f8a65 ("selftests/bpf: make sure signal interrupts BPF_PROG_TEST_RUN")
      Signed-off-by: Stanislav Fomichev <sdf@google.com>
      Acked-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • libbpf: force fixdep compilation at the start of the build · 8e268887
      Stanislav Fomichev authored
      libbpf targets don't explicitly depend on the fixdep target, so when
      we run 'make -j$(nproc)' there is a high probability that some
      objects will be built before the fixdep binary is available.
      
      Fix this by running sub-make; this makes sure that fixdep dependency
      is properly accounted for.
      
      For the same issue in perf, see commit abb26210 ("perf tools: Force
      fixdep compilation at the start of the build").
      
      Before:
      
      $ rm -rf /tmp/bld; mkdir /tmp/bld; make -j$(nproc) O=/tmp/bld -C tools/lib/bpf/
      
      Auto-detecting system features:
      ...                        libelf: [ on  ]
      ...                           bpf: [ on  ]
      
        HOSTCC   /tmp/bld/fixdep.o
        CC       /tmp/bld/libbpf.o
        CC       /tmp/bld/bpf.o
        CC       /tmp/bld/btf.o
        CC       /tmp/bld/nlattr.o
        CC       /tmp/bld/libbpf_errno.o
        CC       /tmp/bld/str_error.o
        CC       /tmp/bld/netlink.o
        CC       /tmp/bld/bpf_prog_linfo.o
        CC       /tmp/bld/libbpf_probes.o
        CC       /tmp/bld/xsk.o
        HOSTLD   /tmp/bld/fixdep-in.o
        LINK     /tmp/bld/fixdep
        LD       /tmp/bld/libbpf-in.o
        LINK     /tmp/bld/libbpf.a
        LINK     /tmp/bld/libbpf.so
        LINK     /tmp/bld/test_libbpf
      
      $ head /tmp/bld/.libbpf.o.cmd
       # cannot find fixdep (/usr/local/google/home/sdf/src/linux/xxx//fixdep)
       # using basic dep data
      
      /tmp/bld/libbpf.o: libbpf.c /usr/include/stdc-predef.h \
       /usr/include/stdlib.h /usr/include/features.h \
       /usr/include/x86_64-linux-gnu/sys/cdefs.h \
       /usr/include/x86_64-linux-gnu/bits/wordsize.h \
       /usr/include/x86_64-linux-gnu/gnu/stubs.h \
       /usr/include/x86_64-linux-gnu/gnu/stubs-64.h \
       /usr/lib/gcc/x86_64-linux-gnu/7/include/stddef.h \
      
      After:
      
      $ rm -rf /tmp/bld; mkdir /tmp/bld; make -j$(nproc) O=/tmp/bld -C tools/lib/bpf/
      
      Auto-detecting system features:
      ...                        libelf: [ on  ]
      ...                           bpf: [ on  ]
      
        HOSTCC   /tmp/bld/fixdep.o
        HOSTLD   /tmp/bld/fixdep-in.o
        LINK     /tmp/bld/fixdep
        CC       /tmp/bld/libbpf.o
        CC       /tmp/bld/bpf.o
        CC       /tmp/bld/nlattr.o
        CC       /tmp/bld/btf.o
        CC       /tmp/bld/libbpf_errno.o
        CC       /tmp/bld/str_error.o
        CC       /tmp/bld/netlink.o
        CC       /tmp/bld/bpf_prog_linfo.o
        CC       /tmp/bld/libbpf_probes.o
        CC       /tmp/bld/xsk.o
        LD       /tmp/bld/libbpf-in.o
        LINK     /tmp/bld/libbpf.a
        LINK     /tmp/bld/libbpf.so
        LINK     /tmp/bld/test_libbpf
      
      $ head /tmp/bld/.libbpf.o.cmd
      cmd_/tmp/bld/libbpf.o := gcc -Wp,-MD,/tmp/bld/.libbpf.o.d -Wp,-MT,/tmp/bld/libbpf.o -g -Wall -DHAVE_LIBELF_MMAP_SUPPORT -DCOMPAT_NEED_REALLOCARRAY -Wbad-function-cast -Wdeclaration-after-statement -Wformat-security -Wformat-y2k -Winit-self -Wmissing-declarations -Wmissing-prototypes -Wnested-externs -Wno-system-headers -Wold-style-definition -Wpacked -Wredundant-decls -Wshadow -Wstrict-prototypes -Wswitch-default -Wswitch-enum -Wundef -Wwrite-strings -Wformat -Wstrict-aliasing=3 -Werror -Wall -fPIC -I. -I/usr/local/google/home/sdf/src/linux/tools/include -I/usr/local/google/home/sdf/src/linux/tools/arch/x86/include/uapi -I/usr/local/google/home/sdf/src/linux/tools/include/uapi -fvisibility=hidden -D"BUILD_STR(s)=$(pound)s" -c -o /tmp/bld/libbpf.o libbpf.c
      
      source_/tmp/bld/libbpf.o := libbpf.c
      
      deps_/tmp/bld/libbpf.o := \
        /usr/include/stdc-predef.h \
        /usr/include/stdlib.h \
        /usr/include/features.h \
        /usr/include/x86_64-linux-gnu/sys/cdefs.h \
        /usr/include/x86_64-linux-gnu/bits/wordsize.h \
      
      Fixes: 7c422f55 ("tools build: Build fixdep helper from perf and basic libs")
      Reported-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Stanislav Fomichev <sdf@google.com>
      Acked-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • selftests: bpf: fix compilation with out-of-tree $(OUTPUT) · e78e00bd
      Stanislav Fomichev authored
      A bunch of related changes lumped together:
      * Create prog_tests and verifier output directories; these don't exist with
        out-of-tree $(OUTPUT)
      * Add missing -I (via separate TEST_{PROGS,VERIFIER}_CFLAGS) for the main tree
        ($(PWD) != $(OUTPUT) for out-of-tree)
      * Add libbpf.a dependency for test_progs_32 (parallel make fails otherwise)
      * Add missing "; \" after "cd" when generating test.h headers
      
      Tested by:
      $ alias m="make -s -j$(nproc)"
      $ m -C tools/testing/selftests/bpf/ clean
      $ m -C tools/lib/bpf/ clean
      $ rm -rf xxx; mkdir xxx; m -C tools/testing/selftests/bpf/ OUTPUT=$PWD/xxx
      $ m -C tools/testing/selftests/bpf/
      
      Fixes: 3f306588 ("selftests: bpf: break up test_progs - preparations")
      Fixes: 2dfb4012 ("selftests: bpf: prepare for break up of verifier tests")
      Fixes: 3ef84346 ("selftests: bpf: makefile support sub-register code-gen test mode")
      Signed-off-by: Stanislav Fomichev <sdf@google.com>
      Acked-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • selftests/bpf: test that GSO works in lwt_ip_encap · 17a90a78
      Peter Oskolkov authored
      Add a test on egress that a large TCP packet successfully goes through
      the lwt+bpf encap tunnel.
      
      Although there is no direct evidence that GSO worked, as opposed to
      e.g. TCP segmentation or IP fragmentation (maybe a kernel stats counter
      should be added to track the number of failed GSO attempts?), without
      the previous patch in the patchset this test fails, and printk-debugging
      showed that software-based GSO succeeded here (veth is not compatible with
      SKB_GSO_DODGY, so GSO happens in the software stack).
      
      Also removed an unnecessary nodad and added a missed failed flag.
      Signed-off-by: Peter Oskolkov <posk@google.com>
      Acked-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • net: fix GSO in bpf_lwt_push_ip_encap · ea0371f7
      Peter Oskolkov authored
      GSO needs inner headers and inner protocol set properly to work.
      
      skb->inner_mac_header: skb_reset_inner_headers() assigns the current
      mac header value to inner_mac_header, but the mac header is not set at
      this point, so we need to call skb_reset_inner_mac_header(); otherwise
      gre_gso_segment() fails. It does:
      
          int tnl_hlen = skb_inner_mac_header(skb) - skb_transport_header(skb);
          ...
          if (unlikely(!pskb_may_pull(skb, tnl_hlen)))
          ...
      
      skb->inner_protocol should also be correctly set.
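
The arithmetic behind the failure can be illustrated with a user-space model. Everything here is an assumption made for illustration: `skb_model`, the `OFFSET_UNSET` sentinel, and the helper names are hypothetical, and header positions are modelled as plain byte offsets rather than pointers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption of this model: an unset header offset is stored as 0xFFFF. */
#define OFFSET_UNSET 0xFFFFu

/* Hypothetical stand-in for the few sk_buff fields involved. */
struct skb_model {
    unsigned int len;           /* bytes available in the packet */
    uint16_t mac_header;        /* outer mac header offset */
    uint16_t transport_header;  /* where the tunnel header starts */
    uint16_t inner_mac_header;  /* where the encapsulated frame starts */
};

/* Mirrors the quoted computation, using offsets instead of pointers. */
static int tunnel_hlen(const struct skb_model *skb)
{
    return (int)skb->inner_mac_header - (int)skb->transport_header;
}

/* Models the pull check: the tunnel header must fit inside the packet. */
static bool may_pull(const struct skb_model *skb, int hlen)
{
    return hlen >= 0 && (unsigned int)hlen <= skb->len;
}

/* Copies the (possibly unset) outer mac header offset to the inner one. */
static void copy_outer_to_inner(struct skb_model *skb)
{
    skb->inner_mac_header = skb->mac_header;
}

/* Records an explicit inner mac header offset instead. */
static void set_inner_mac(struct skb_model *skb, uint16_t offset)
{
    skb->inner_mac_header = offset;
}
```

With an unset mac header, copying it yields a tunnel header length far larger than the packet, so the pull check fails; setting the inner offset explicitly gives a small, valid length.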
      
      Fixes: ca78801a ("bpf: handle GSO in bpf_lwt_push_encap")
      Signed-off-by: Peter Oskolkov <posk@google.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Acked-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • xsk: fix potential crash in xsk_diag_put_umem() · 915905f8
      Eric Dumazet authored
      Fix two typos in xsk_diag_put_umem().
      
      syzbot reported the following crash:
      
      kasan: CONFIG_KASAN_INLINE enabled
      kasan: GPF could be caused by NULL-ptr deref or user memory access
      general protection fault: 0000 [#1] PREEMPT SMP KASAN
      CPU: 1 PID: 7641 Comm: syz-executor946 Not tainted 5.0.0-rc7+ #95
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:xsk_diag_put_umem net/xdp/xsk_diag.c:71 [inline]
      RIP: 0010:xsk_diag_fill net/xdp/xsk_diag.c:113 [inline]
      RIP: 0010:xsk_diag_dump+0xdcb/0x13a0 net/xdp/xsk_diag.c:143
      Code: 8d be c0 04 00 00 48 89 f8 48 c1 e8 03 42 80 3c 20 00 0f 85 39 04 00 00 49 8b 96 c0 04 00 00 48 8d 7a 14 48 89 f8 48 c1 e8 03 <42> 0f b6 0c 20 48 89 f8 83 e0 07 83 c0 03 38 c8 7c 08 84 c9 0f 85
      RSP: 0018:ffff888090bcf2d8 EFLAGS: 00010203
      RAX: 0000000000000002 RBX: ffff8880a0aacbc0 RCX: ffffffff86ffdc3c
      RDX: 0000000000000000 RSI: ffffffff86ffdc70 RDI: 0000000000000014
      RBP: ffff888090bcf438 R08: ffff88808e04a700 R09: ffffed1011c74174
      R10: ffffed1011c74173 R11: ffff88808e3a0b9f R12: dffffc0000000000
      R13: ffff888093a6d818 R14: ffff88808e365240 R15: ffff88808e3a0b40
      FS:  00000000011ea880(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000020000080 CR3: 000000008fa13000 CR4: 00000000001406e0
      Call Trace:
       netlink_dump+0x55d/0xfb0 net/netlink/af_netlink.c:2252
       __netlink_dump_start+0x5b4/0x7e0 net/netlink/af_netlink.c:2360
       netlink_dump_start include/linux/netlink.h:226 [inline]
       xsk_diag_handler_dump+0x1b2/0x250 net/xdp/xsk_diag.c:170
       __sock_diag_cmd net/core/sock_diag.c:232 [inline]
       sock_diag_rcv_msg+0x322/0x410 net/core/sock_diag.c:263
       netlink_rcv_skb+0x17a/0x460 net/netlink/af_netlink.c:2485
       sock_diag_rcv+0x2b/0x40 net/core/sock_diag.c:274
       netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
       netlink_unicast+0x536/0x720 net/netlink/af_netlink.c:1336
       netlink_sendmsg+0x8ae/0xd70 net/netlink/af_netlink.c:1925
       sock_sendmsg_nosec net/socket.c:622 [inline]
       sock_sendmsg+0xdd/0x130 net/socket.c:632
       sock_write_iter+0x27c/0x3e0 net/socket.c:923
       call_write_iter include/linux/fs.h:1863 [inline]
       do_iter_readv_writev+0x5e0/0x8e0 fs/read_write.c:680
       do_iter_write fs/read_write.c:956 [inline]
       do_iter_write+0x184/0x610 fs/read_write.c:937
       vfs_writev+0x1b3/0x2f0 fs/read_write.c:1001
       do_writev+0xf6/0x290 fs/read_write.c:1036
       __do_sys_writev fs/read_write.c:1109 [inline]
       __se_sys_writev fs/read_write.c:1106 [inline]
       __x64_sys_writev+0x75/0xb0 fs/read_write.c:1106
       do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x440139
      Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007ffcc966cc18 EFLAGS: 00000246 ORIG_RAX: 0000000000000014
      RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 0000000000440139
      RDX: 0000000000000001 RSI: 0000000020000080 RDI: 0000000000000003
      RBP: 00000000006ca018 R08: 00000000004002c8 R09: 00000000004002c8
      R10: 0000000000000004 R11: 0000000000000246 R12: 00000000004019c0
      R13: 0000000000401a50 R14: 0000000000000000 R15: 0000000000000000
      Modules linked in:
      ---[ end trace 460a3c24d0a656c9 ]---
      RIP: 0010:xsk_diag_put_umem net/xdp/xsk_diag.c:71 [inline]
      RIP: 0010:xsk_diag_fill net/xdp/xsk_diag.c:113 [inline]
      RIP: 0010:xsk_diag_dump+0xdcb/0x13a0 net/xdp/xsk_diag.c:143
      Code: 8d be c0 04 00 00 48 89 f8 48 c1 e8 03 42 80 3c 20 00 0f 85 39 04 00 00 49 8b 96 c0 04 00 00 48 8d 7a 14 48 89 f8 48 c1 e8 03 <42> 0f b6 0c 20 48 89 f8 83 e0 07 83 c0 03 38 c8 7c 08 84 c9 0f 85
      RSP: 0018:ffff888090bcf2d8 EFLAGS: 00010203
      RAX: 0000000000000002 RBX: ffff8880a0aacbc0 RCX: ffffffff86ffdc3c
      RDX: 0000000000000000 RSI: ffffffff86ffdc70 RDI: 0000000000000014
      RBP: ffff888090bcf438 R08: ffff88808e04a700 R09: ffffed1011c74174
      R10: ffffed1011c74173 R11: ffff88808e3a0b9f R12: dffffc0000000000
      R13: ffff888093a6d818 R14: ffff88808e365240 R15: ffff88808e3a0b40
      FS:  00000000011ea880(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000001d22000 CR3: 000000008fa13000 CR4: 00000000001406f0
      
      Fixes: a36b38aa ("xsk: add sock_diag interface for AF_XDP")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Cc: Björn Töpel <bjorn.topel@intel.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Magnus Karlsson <magnus.karlsson@intel.com>
      Acked-by: Björn Töpel <bjorn.topel@intel.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • bpf: hbm: fix spelling mistake "deault" -> "default" · 5b4f21b2
      Colin Ian King authored
      There are a couple of typos; fix these.
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Acked-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • bpf: only test gso type on gso packets · 4c3024de
      Willem de Bruijn authored
      BPF can adjust gso only for tcp bytestreams. Fail on other gso types.
      
      But only on gso packets. It does not touch this field if !gso_size.
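
The intended condition can be sketched as a predicate. This is a hedged model, not the kernel helper: `pkt_model`, `check_old` and `check_fixed` are hypothetical names, and user-space `ENOTSUP` is used as a stand-in error code.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the two properties the check looks at. */
struct pkt_model {
    unsigned int gso_size;  /* zero means the packet is not a GSO packet */
    bool gso_is_tcp;        /* GSO type is a TCP bytestream */
};

/* Old behaviour as described: the type test runs on every packet. */
static int check_old(const struct pkt_model *p)
{
    if (!p->gso_is_tcp)
        return -ENOTSUP;
    return 0;
}

/* Fixed behaviour: the type only matters when the packet is GSO. */
static int check_fixed(const struct pkt_model *p)
{
    if (p->gso_size && !p->gso_is_tcp)
        return -ENOTSUP;
    return 0;
}
```

A plain non-GSO packet is rejected by `check_old()` but accepted by `check_fixed()`; a non-TCP GSO packet is rejected by both.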
      
      Fixes: b90efd22 ("bpf: only adjust gso_size on bytestream protocols")
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Acked-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • bpf: fix sysctl.c warning · 78c3aff8
      Arnd Bergmann authored
      When CONFIG_BPF_SYSCALL or CONFIG_SYSCTL is disabled, we get
      a warning about an unused function:
      
      kernel/sysctl.c:3331:12: error: 'proc_dointvec_minmax_bpf_stats' defined but not used [-Werror=unused-function]
       static int proc_dointvec_minmax_bpf_stats(struct ctl_table *table, int write,
      
      The CONFIG_BPF_SYSCALL check was already handled, but the SYSCTL check
      is needed on top.
      
      Fixes: 492ecee8 ("bpf: enable program stats")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Reviewed-by: Christian Brauner <christian@brauner.io>
      Acked-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
  2. 06 Mar, 2019 6 commits
    • tcp: detecting the misuse of .sendpage for Slab objects · a10674bf
      Vasily Averin authored
      sendpage was not designed for processing Slab pages; in some
      situations it can trigger a BUG_ON on the receiving side.
      Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • appletalk: Add atalk.h header files to MAINTAINERS file · 7b837623
      Arnd Bergmann authored
      Add the path names here so that git-send-email can pick up the
      netdev@vger.kernel.org Cc line automatically for a patch that
      only touches the headers.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • appletalk: Fix compile regression · 27da0d2e
      Arnd Bergmann authored
      A bugfix just broke compilation of appletalk when CONFIG_SYSCTL
      is disabled:
      
      In file included from net/appletalk/ddp.c:65:
      net/appletalk/ddp.c: In function 'atalk_init':
      include/linux/atalk.h:164:34: error: expected expression before 'do'
       #define atalk_register_sysctl()  do { } while(0)
                                        ^~
      net/appletalk/ddp.c:1934:7: note: in expansion of macro 'atalk_register_sysctl'
        rc = atalk_register_sysctl();
      
      This is easier to avoid by using conventional inline functions
      as stubs rather than macros. The header already has inline
      functions for other purposes, so I'm changing over all the
      macros for consistency.
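
A minimal sketch of why an inline stub works where the macro stub does not. The names `register_sysctl_stub` and `init_model` are hypothetical; only the shape of the problem is taken from the error above.

```c
/*
 * A statement-like macro stub such as
 *
 *     #define register_sysctl_stub()  do { } while (0)
 *
 * expands to a statement, not an expression, so `rc = register_sysctl_stub();`
 * does not compile. An inline function stub has a value and a type:
 */
static inline int register_sysctl_stub(void)
{
    return 0;   /* nothing to register in this configuration */
}

/* The caller can now use the result regardless of configuration. */
static int init_model(void)
{
    int rc;

    rc = register_sysctl_stub();
    if (rc)
        return rc;
    return 0;
}
```

The inline form also lets the compiler type-check arguments and return values in configurations where the real implementation is compiled out.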
      
      Fixes: 6377f787 ("appletalk: Fix use-after-free in atalk_proc_exit")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • iptunnel: NULL pointer deref for ip_md_tunnel_xmit · f4b3ec4e
      Alan Maguire authored
      Naresh Kamboju noted the following oops during execution of selftest
      tools/testing/selftests/bpf/test_tunnel.sh on x86_64:
      
      [  274.120445] BUG: unable to handle kernel NULL pointer dereference
      at 0000000000000000
      [  274.128285] #PF error: [INSTR]
      [  274.131351] PGD 8000000414a0e067 P4D 8000000414a0e067 PUD 3b6334067 PMD 0
      [  274.138241] Oops: 0010 [#1] SMP PTI
      [  274.141734] CPU: 1 PID: 11464 Comm: ping Not tainted
      5.0.0-rc4-next-20190129 #1
      [  274.149046] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS
      2.0b 07/27/2017
      [  274.156526] RIP: 0010:          (null)
      [  274.160280] Code: Bad RIP value.
      [  274.163509] RSP: 0018:ffffbc9681f83540 EFLAGS: 00010286
      [  274.168726] RAX: 0000000000000000 RBX: ffffdc967fa80a18 RCX: 0000000000000000
      [  274.175851] RDX: ffff9db2ee08b540 RSI: 000000000000000e RDI: ffffdc967fa809a0
      [  274.182974] RBP: ffffbc9681f83580 R08: ffff9db2c4d62690 R09: 000000000000000c
      [  274.190098] R10: 0000000000000000 R11: ffff9db2ee08b540 R12: ffff9db31ce7c000
      [  274.197222] R13: 0000000000000001 R14: 000000000000000c R15: ffff9db3179cf400
      [  274.204346] FS:  00007ff4ae7c5740(0000) GS:ffff9db31fa80000(0000)
      knlGS:0000000000000000
      [  274.212424] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  274.218162] CR2: ffffffffffffffd6 CR3: 00000004574da004 CR4: 00000000003606e0
      [  274.225292] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  274.232416] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  274.239541] Call Trace:
      [  274.241988]  ? tnl_update_pmtu+0x296/0x3b0
      [  274.246085]  ip_md_tunnel_xmit+0x1bc/0x520
      [  274.250176]  gre_fb_xmit+0x330/0x390
      [  274.253754]  gre_tap_xmit+0x128/0x180
      [  274.257414]  dev_hard_start_xmit+0xb7/0x300
      [  274.261598]  sch_direct_xmit+0xf6/0x290
      [  274.265430]  __qdisc_run+0x15d/0x5e0
      [  274.269007]  __dev_queue_xmit+0x2c5/0xc00
      [  274.273011]  ? dev_queue_xmit+0x10/0x20
      [  274.276842]  ? eth_header+0x2b/0xc0
      [  274.280326]  dev_queue_xmit+0x10/0x20
      [  274.283984]  ? dev_queue_xmit+0x10/0x20
      [  274.287813]  arp_xmit+0x1a/0xf0
      [  274.290952]  arp_send_dst.part.19+0x46/0x60
      [  274.295138]  arp_solicit+0x177/0x6b0
      [  274.298708]  ? mod_timer+0x18e/0x440
      [  274.302281]  neigh_probe+0x57/0x70
      [  274.305684]  __neigh_event_send+0x197/0x2d0
      [  274.309862]  neigh_resolve_output+0x18c/0x210
      [  274.314212]  ip_finish_output2+0x257/0x690
      [  274.318304]  ip_finish_output+0x219/0x340
      [  274.322314]  ? ip_finish_output+0x219/0x340
      [  274.326493]  ip_output+0x76/0x240
      [  274.329805]  ? ip_fragment.constprop.53+0x80/0x80
      [  274.334510]  ip_local_out+0x3f/0x70
      [  274.337992]  ip_send_skb+0x19/0x40
      [  274.341391]  ip_push_pending_frames+0x33/0x40
      [  274.345740]  raw_sendmsg+0xc15/0x11d0
      [  274.349403]  ? __might_fault+0x85/0x90
      [  274.353151]  ? _copy_from_user+0x6b/0xa0
      [  274.357070]  ? rw_copy_check_uvector+0x54/0x130
      [  274.361604]  inet_sendmsg+0x42/0x1c0
      [  274.365179]  ? inet_sendmsg+0x42/0x1c0
      [  274.368937]  sock_sendmsg+0x3e/0x50
      [  274.372460]  ___sys_sendmsg+0x26f/0x2d0
      [  274.376293]  ? lock_acquire+0x95/0x190
      [  274.380043]  ? __handle_mm_fault+0x7ce/0xb70
      [  274.384307]  ? lock_acquire+0x95/0x190
      [  274.388053]  ? __audit_syscall_entry+0xdd/0x130
      [  274.392586]  ? ktime_get_coarse_real_ts64+0x64/0xc0
      [  274.397461]  ? __audit_syscall_entry+0xdd/0x130
      [  274.401989]  ? trace_hardirqs_on+0x4c/0x100
      [  274.406173]  __sys_sendmsg+0x63/0xa0
      [  274.409744]  ? __sys_sendmsg+0x63/0xa0
      [  274.413488]  __x64_sys_sendmsg+0x1f/0x30
      [  274.417405]  do_syscall_64+0x55/0x190
      [  274.421064]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [  274.426113] RIP: 0033:0x7ff4ae0e6e87
      [  274.429686] Code: 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 80 00
      00 00 00 8b 05 ca d9 2b 00 48 63 d2 48 63 ff 85 c0 75 10 b8 2e 00 00
      00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 53 48 89 f3 48 83 ec 10 48 89 7c
      24 08
      [  274.448422] RSP: 002b:00007ffcd9b76db8 EFLAGS: 00000246 ORIG_RAX:
      000000000000002e
      [  274.455978] RAX: ffffffffffffffda RBX: 0000000000000040 RCX: 00007ff4ae0e6e87
      [  274.463104] RDX: 0000000000000000 RSI: 00000000006092e0 RDI: 0000000000000003
      [  274.470228] RBP: 0000000000000000 R08: 00007ffcd9bc40a0 R09: 00007ffcd9bc4080
      [  274.477349] R10: 000000000000060a R11: 0000000000000246 R12: 0000000000000003
      [  274.484475] R13: 0000000000000016 R14: 00007ffcd9b77fa0 R15: 00007ffcd9b78da4
      [  274.491602] Modules linked in: cls_bpf sch_ingress iptable_filter
      ip_tables algif_hash af_alg x86_pkg_temp_thermal fuse [last unloaded:
      test_bpf]
      [  274.504634] CR2: 0000000000000000
      [  274.507976] ---[ end trace 196d18386545eae1 ]---
      [  274.512588] RIP: 0010:          (null)
      [  274.516334] Code: Bad RIP value.
      [  274.519557] RSP: 0018:ffffbc9681f83540 EFLAGS: 00010286
      [  274.524775] RAX: 0000000000000000 RBX: ffffdc967fa80a18 RCX: 0000000000000000
      [  274.531921] RDX: ffff9db2ee08b540 RSI: 000000000000000e RDI: ffffdc967fa809a0
      [  274.539082] RBP: ffffbc9681f83580 R08: ffff9db2c4d62690 R09: 000000000000000c
      [  274.546205] R10: 0000000000000000 R11: ffff9db2ee08b540 R12: ffff9db31ce7c000
      [  274.553329] R13: 0000000000000001 R14: 000000000000000c R15: ffff9db3179cf400
      [  274.560456] FS:  00007ff4ae7c5740(0000) GS:ffff9db31fa80000(0000)
      knlGS:0000000000000000
      [  274.568541] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  274.574277] CR2: ffffffffffffffd6 CR3: 00000004574da004 CR4: 00000000003606e0
      [  274.581403] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  274.588535] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  274.595658] Kernel panic - not syncing: Fatal exception in interrupt
      [  274.602046] Kernel Offset: 0x14400000 from 0xffffffff81000000
      (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
      [  274.612827] ---[ end Kernel panic - not syncing: Fatal exception in
      interrupt ]---
      [  274.620387] ------------[ cut here ]------------
      
      I'm also seeing the same failure on x86_64, and it reproduces
      consistently.
      
      From poking around, it looks like the skb's dst entry is being used
      to calculate the mtu in:
      
      mtu = skb_dst(skb) ? dst_mtu(skb_dst(skb)) : dev->mtu;
      
      ...but because that dst_entry has an "ops" value set to md_dst_ops,
      the various ops (including mtu) are not set:
      
      crash> struct sk_buff._skb_refdst ffff928f87447700 -x
            _skb_refdst = 0xffffcd6fbf5ea590
      crash> struct dst_entry.ops 0xffffcd6fbf5ea590
        ops = 0xffffffffa0193800
      crash> struct dst_ops.mtu 0xffffffffa0193800
        mtu = 0x0
      crash>
      
      I confirmed that the dst entry also has dst->input set to
      dst_md_discard, so it looks like it's an entry that's been
      initialized via __metadata_dst_init alright.
      
      I think the fix here is to use skb_valid_dst(skb), which also checks
      for DST_METADATA. With that fix in place, the problem, which was
      previously 100% reproducible, disappears.
      
      The below patch resolves the panic and all bpf tunnel tests pass
      without incident.
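
The difference between the two checks can be sketched in user space. This is a model under stated assumptions: `dst_model`, `dst_ops_model`, `DST_METADATA_MODEL`, `dst_is_valid` and `pick_mtu` are hypothetical stand-ins, and the only behaviour modelled is that a metadata dst has no mtu operation and must not be used to look one up.

```c
#include <stddef.h>

/* Hypothetical flag standing in for the metadata marker. */
#define DST_METADATA_MODEL 0x1u

struct dst_model;

struct dst_ops_model {
    unsigned int (*mtu)(const struct dst_model *dst); /* may be NULL */
};

struct dst_model {
    const struct dst_ops_model *ops;
    unsigned int flags;
    unsigned int route_mtu;
};

/* mtu operation of an ordinary route entry in this model. */
static unsigned int route_mtu_op(const struct dst_model *dst)
{
    return dst->route_mtu;
}

/* Valid means: present and not a metadata-only entry. */
static int dst_is_valid(const struct dst_model *dst)
{
    return dst != NULL && !(dst->flags & DST_METADATA_MODEL);
}

/* Fall back to the device mtu unless the dst can really answer. */
static unsigned int pick_mtu(const struct dst_model *dst, unsigned int dev_mtu)
{
    return dst_is_valid(dst) ? dst->ops->mtu(dst) : dev_mtu;
}
```

Testing only for a non-NULL dst would call through the NULL `mtu` pointer for a metadata entry, which matches the "RIP: (null)" seen in the oops.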
      
      Fixes: c8b34e68 ("ip_tunnel: Add tnl_update_pmtu in ip_md_tunnel_xmit")
      Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
      Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Tested-by: Anders Roxell <anders.roxell@linaro.org>
      Reported-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Tested-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4/route: fail early when inet dev is missing · 22c74764
      Paolo Abeni authored
      If a non local multicast packet reaches ip_route_input_rcu() while
      the ingress device IPv4 private data (in_dev) is NULL, we end up
      doing a NULL pointer dereference in IN_DEV_MFORWARD().
      
      Since the later call to ip_route_input_mc() is going to fail if
      !in_dev, we can fail early in such scenario and avoid the dangerous
      code path.
      
      v1 -> v2:
       - clarified the commit message, no code changes
      Reported-by: Tianhao Zhao <tizhao@redhat.com>
      Fixes: e58e4159 ("net: Enable support for VRF with ipv4 multicast")
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hns3: Fix a logical vs bitwise typo · f4772dee
      Dan Carpenter authored
      There were a couple logical ORs accidentally mixed in with the bitwise
      ORs.
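
The effect of such a typo is easy to show in isolation. The functions below are a hypothetical illustration, not the driver code: three already-shifted field values are combined into one word, once with a stray `||` and once with `|` throughout.

```c
#include <stdint.h>

/* Buggy: `||` binds looser than `|`, so this is (a | b) || c,
 * which collapses the whole word to 0 or 1. */
static uint32_t combine_buggy(uint32_t a, uint32_t b, uint32_t c)
{
    return a | b || c;
}

/* Fixed: every field is merged bit by bit. */
static uint32_t combine_fixed(uint32_t a, uint32_t b, uint32_t c)
{
    return a | b | c;
}
```

For inputs `0x1`, `0x10` and `0x100`, the buggy version yields `1` while the fixed version yields `0x111`.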
      
      Fixes: e8149933 ("net: hns3: remove hnae3_get_bit in data path")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 05 Mar, 2019 21 commits
    • net/sched: act_tunnel_key: Fix double free dst_cache · 4177c5d9
      wenxu authored
      dst_cache_destroy() is already called on the dst_release() path:
      
      dst_release-->dst_destroy_rcu-->dst_destroy-->metadata_dst_free
      -->dst_cache_destroy
      
      So dst_cache_destroy() should not be called again before dst_release().
      
      Fixes: 41411e2f ("net/sched: act_tunnel_key: Add dst_cache support")
      Signed-off-by: wenxu <wenxu@ucloud.cn>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: fix RDM/DGRAM connect() regression · 0e632089
      Erik Hugne authored
      Fix regression bug introduced in
      commit 365ad353 ("tipc: reduce risk of user starvation during link
      congestion")
      
      Only signal -EDESTADDRREQ for RDM/DGRAM if we don't have a cached
      sockaddr.
      
      Fixes: 365ad353 ("tipc: reduce risk of user starvation during link congestion")
      Signed-off-by: Erik Hugne <erik.hugne@gmail.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'mips_5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux · d9862cfb
      Linus Torvalds authored
      Pull MIPS updates from Paul Burton:
      
       - Support for the MIPSr6 MemoryMapID register & Global INValidate TLB
         (GINVT) instructions, allowing for more efficient TLB maintenance
         when running on a CPU such as the I6500 that supports these.
      
       - Enable huge page support for MIPS64r6.
      
       - Optimize post-DMA cache sync by removing that code entirely for
         kernel configurations in which we know it won't be needed.
      
       - The number of pages allocated for interrupt stacks is now calculated
         correctly, where before we would wastefully allocate too much memory
         in some configurations.
      
       - The ath79 platform migrates to devicetree.
      
       - The bcm47xx platform sees fixes for the Buffalo WHR-G54S board.
      
       - The ingenic/jz4740 platform gains support for appended devicetrees.
      
       - The cavium_octeon, lantiq, loongson32 & sgi-ip27 platforms all see
         cleanups as do various pieces of core architecture code.
      
      * tag 'mips_5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: (66 commits)
        MIPS: lantiq: Remove separate GPHY Firmware loader
        MIPS: ingenic: Add support for appended devicetree
        MIPS: SGI-IP27: rework HUB interrupts
        MIPS: SGI-IP27: do boot CPU init later
        MIPS: SGI-IP27: do xtalk scanning later
        MIPS: SGI-IP27: use pr_info/pr_emerg and pr_cont to fix output
        MIPS: SGI-IP27: clean up bridge access and header files
        MIPS: SGI-IP27: get rid of volatile and hubreg_t
        MIPS: irq: Allocate accurate order pages for irq stack
        MIPS: dma-noncoherent: Remove bogus condition in dma_sync_phys()
        MIPS: eBPF: Remove REG_32BIT_ZERO_EX
        MIPS: eBPF: Always return sign extended 32b values
        MIPS: CM: Fix indentation
        MIPS: BCM47XX: Fix/improve Buffalo WHR-G54S support
        MIPS: OCTEON: program rx/tx-delay always from DT
        MIPS: OCTEON: delete board-specific link status
        MIPS: OCTEON: don't lie about interface type of CN3005 board
        MIPS: OCTEON: warn if deprecated link status is being used
        MIPS: OCTEON: add fixed-link nodes to in-kernel device tree
        MIPS: Delete unused flush_cache_sigtramp()
        ...
      d9862cfb
    • Linus Torvalds's avatar
      Merge branch 'parisc-5.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux · 8feed3ef
      Linus Torvalds authored
      Pull parisc updates from Helge Deller:
       "The most important changes in this patch set are:
      
         - DMA-related cleanups for parisc with the aim to move anything not
           required by drivers out of <asm/dma-mapping.h>, by Christoph
           Hellwig
      
         - Switch to memblock_alloc(), by Mike Rapoport
      
         - Makefile cleanups by Masahiro Yamada
      
         - Switch to bust_spinlocks(), by Sergey Senozhatsky
      
         - Improved initial SMP affinity selection for IRQs
      
         - Added IPI- and rescheduling interrupts in /proc/interrupts output"
      
      * 'parisc-5.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: (21 commits)
        parisc: use memblock_alloc() instead of custom get_memblock()
        parisc: Add constants for various PDC firmware calls
        parisc: Add constant for PDC_PAT_COMPLEX firmware call
        parisc: Show machine product number during boot
        parisc: Add constants for PDC_RELOCATE PDC call
        parisc: Add PDC_CRASH_PREP PDC function number
        parisc: Use F_EXTEND() macro in iosapic code
        parisc: remove the HBA_DATA macro
        parisc/lba_pci: use container_of in LBA_DEV
        parisc/dino: use container_of in DINO_DEV
        parisc: properly type the return value of parisc_walk_tree
        parisc: properly type the iommu field in struct pci_hba_data
        parisc: turn GET_IOC into an inline function
        parisc: move internal implementation details out of <asm/dma-mapping.h>
        parisc: don't include <asm/cacheflush.h> in <asm/dma-mapping.h>
        parisc: remove meaningless ccflags-y in arch/parisc/boot/Makefile
        parisc: replace oops_in_progress manipulation with bust_spinlocks()
        parisc: Improve initial IRQ to CPU assignment
        parisc: Count IPI function call interrupts
        parisc: Show rescheduling interrupts on SMP machines only
        ...
      8feed3ef
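      Two commits in the parisc pull above switch LBA_DEV and DINO_DEV to
      container_of. A minimal userspace sketch of that pattern follows. It
      assumes a simplified macro built on offsetof (the kernel's version
      adds type checking), and the struct layouts and the helper name are
      hypothetical, not the parisc definitions.

```c
#include <stddef.h>

/* Simplified userspace version of the pattern; not the kernel macro. */
#define toy_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical layouts: a generic part embedded in a device struct. */
struct toy_hba_data {
	int hba_num;
};

struct toy_lba_device {
	int flags;
	struct toy_hba_data hba;	/* embedded, not a pointer */
};

/* Recover the enclosing device from a pointer to its embedded member. */
struct toy_lba_device *toy_lba_dev(struct toy_hba_data *hba)
{
	return toy_container_of(hba, struct toy_lba_device, hba);
}
```

      Compared with casting an opaque pointer, this keeps working if the
      embedded member is moved within the enclosing struct.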
    • Linus Torvalds's avatar
      Merge tag 's390-5.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 3591b195
      Linus Torvalds authored
      Pull s390 updates from Martin Schwidefsky:
      
       - A copy of Arnd's compat wrapper generation series
      
       - Pass information about the KVM guest to the host in the form of the
         control program code and the control program version code
      
       - Map IOV resources to support PCI physical functions on s390
      
       - Add vector load and store alignment hints to improve performance
      
       - Use the "jdd" constraint with gcc 9 to make jump labels work again
      
       - Remove amode workaround for old z/VM releases from the DCSS code
      
       - Add support for in-kernel performance measurements using the CPU
         measurement counter facility
      
       - Introduce a new PMU device cpum_cf_diag to capture counters and store
         them as event raw data.
      
       - Bug fixes and cleanups
      
      * tag 's390-5.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (54 commits)
        Revert "s390/cpum_cf: Add kernel message exaplanations"
        s390/dasd: fix read device characteristic with CONFIG_VMAP_STACK=y
        s390/suspend: fix prefix register reset in swsusp_arch_resume
        s390: warn about clearing als implied facilities
        s390: allow overriding facilities via command line
        s390: clean up redundant facilities list setup
        s390/als: remove duplicated in-place implementation of stfle
        s390/cio: Use cpa range elsewhere within vfio-ccw
        s390/cio: Fix vfio-ccw handling of recursive TICs
        s390: vfio_ap: link the vfio_ap devices to the vfio_ap bus subsystem
        s390/cpum_cf: Handle EBUSY return code from CPU counter facility reservation
        s390/cpum_cf: Add kernel message exaplanations
        s390/cpum_cf_diag: Add support for s390 counter facility diagnostic trace
        s390/cpum_cf: add ctr_stcctm() function
        s390/cpum_cf: move common functions into a separate file
        s390/cpum_cf: introduce kernel_cpumcf_avail() function
        s390/cpu_mf: replace stcctm5() with the stcctm() function
        s390/cpu_mf: add store cpu counter multiple instruction support
        s390/cpum_cf: Add minimal in-kernel interface for counter measurements
        s390/cpum_cf: introduce kernel_cpumcf_alert() to obtain measurement alerts
        ...
      3591b195
    • Linus Torvalds's avatar
      Merge tag 'm68k-for-v5.1-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k · 45f5532a
      Linus Torvalds authored
      Pull m68k updates from Geert Uytterhoeven:
      
       - VLA removal
      
       - gcc-8.x build fixes
      
       - small improvements and cleanups
      
       - defconfig updates
      
      * tag 'm68k-for-v5.1-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
        m68k: Add -ffreestanding to CFLAGS
        m68k/apollo: Fix comment in Makefile
        dio: Fix buffer overflow in case of unknown board
        m68k/defconfig: Update defconfigs for v5.0-rc1
        m68k/atari: Avoid VLA use in atari_switches_setup()
        m68k: Avoid VLA use in mangle_kernel_stack()
        m68k/mac: Use '030 reset method on SE/30
        m68k/mac: Remove obsolete comment
        m68k/mac: Skip VIA port setup unless RTC is connected
        m68k/mac: Clean up unused timer definitions
        m68k/defconfig: Drop NET_VENDOR_<FOO>=n
      45f5532a
    • Borislav Petkov's avatar
      x86: Deprecate a.out support · eac61655
      Borislav Petkov authored
      Linux has supported ELF binaries for ~25 years now.  a.out coredumping
      has bitrotten quite significantly and would need some fixing to get it
      into shape again, but considering that even the toolchains cannot create
      a.out executables in their default configuration, let's deprecate a.out
      support and remove it a couple of releases later instead.
      
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Acked-by: Richard Weinberger <richard@nod.at>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
      Cc: Jann Horn <jannh@google.com>
      Cc: <linux-api@vger.kernel.org>
      Cc: <linux-fsdevel@vger.kernel.org>
      Cc: lkml <linux-kernel@vger.kernel.org>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: <x86@kernel.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      eac61655
    • Linus Torvalds's avatar
      a.out: remove core dumping support · 08300f44
      Linus Torvalds authored
      We're (finally) phasing out a.out support for good.  As Borislav Petkov
      points out, we've supported ELF binaries for about 25 years by now, and
      coredumping in particular has bitrotted over the years.
      
      None of the tool chains even support generating a.out binaries any more,
      and the plan is to deprecate a.out support entirely for the kernel.  But
      I want to start with just removing the core dumping code, because I can
      still imagine that somebody actually might want to support a.out as a
      simpler binary format.
      
      Particularly if you generate some random binaries on the fly, ELF is a
      much more complicated format (admittedly ELF also does have a lot of
      toolchain support, mitigating that complexity a lot and you really
      should have moved over in the last 25 years).
      
      So it's at least somewhat possible that somebody out there has some
      workflow that still involves generating and running a.out executables.
      
      In contrast, it's very unlikely that anybody depends on debugging any
      legacy a.out core files.  But regardless, I want this phase-out to be
      done in two steps, so that we can resurrect a.out support (if needed)
      without having to resurrect the core file dumping that is almost
      certainly not needed.
      
      Jann Horn pointed to the <asm/a.out-core.h> file that my first trivial
      cut at this had missed.
      
      And Alan Cox points out that the a.out binary loader _could_ be done in
      user space if somebody wants to, but we might keep just the loader in
      the kernel if somebody really wants it, since the loader isn't that big
      and has no really odd special cases like the core dumping does.
      
      Acked-by: Borislav Petkov <bp@alien8.de>
      Cc: Alan Cox <gnomes@lxorguk.ukuu.org.uk>
      Cc: Jann Horn <jannh@google.com>
      Cc: Richard Weinberger <richard@nod.at>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      08300f44
    • Linus Torvalds's avatar
      Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · 63bdf428
      Linus Torvalds authored
      Pull crypto update from Herbert Xu:
       "API:
         - Add helper for simple skcipher modes.
         - Add helper to register multiple templates.
         - Set CRYPTO_TFM_NEED_KEY when setkey fails.
         - Require neither or both of export/import in shash.
         - AEAD decryption test vectors are now generated from encryption
           ones.
         - New option CONFIG_CRYPTO_MANAGER_EXTRA_TESTS that includes random
           fuzzing.
      
        Algorithms:
         - Conversions to skcipher and helper for many templates.
         - Add more test vectors for nhpoly1305 and adiantum.
      
        Drivers:
         - Add crypto4xx prng support.
         - Add xcbc/cmac/ecb support in caam.
         - Add AES support for Exynos5433 in s5p.
         - Remove sha384/sha512 from artpec7 as hardware cannot do partial
           hash"
      
      [ There is a merge of the Freescale SoC tree in order to pull in changes
        required by patches to the caam/qi2 driver. ]
      
      * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (174 commits)
        crypto: s5p - add AES support for Exynos5433
        dt-bindings: crypto: document Exynos5433 SlimSSS
        crypto: crypto4xx - add missing of_node_put after of_device_is_available
        crypto: cavium/zip - fix collision with generic cra_driver_name
        crypto: af_alg - use struct_size() in sock_kfree_s()
        crypto: caam - remove redundant likely/unlikely annotation
        crypto: s5p - update iv after AES-CBC op end
        crypto: x86/poly1305 - Clear key material from stack in SSE2 variant
        crypto: caam - generate hash keys in-place
        crypto: caam - fix DMA mapping xcbc key twice
        crypto: caam - fix hash context DMA unmap size
        hwrng: bcm2835 - fix probe as platform device
        crypto: s5p-sss - Use AES_BLOCK_SIZE define instead of number
        crypto: stm32 - drop pointless static qualifier in stm32_hash_remove()
        crypto: chelsio - Fixed Traffic Stall
        crypto: marvell - Remove set but not used variable 'ivsize'
        crypto: ccp - Update driver messages to remove some confusion
        crypto: adiantum - add 1536 and 4096-byte test vectors
        crypto: nhpoly1305 - add a test vector with len % 16 != 0
        crypto: arm/aes-ce - update IV after partial final CTR block
        ...
      63bdf428
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next · 64563003
      Linus Torvalds authored
      Pull networking updates from David Miller:
       "Here we go, another merge window full of networking and #ebpf changes:
      
         1) Snoop DHCPACKS in batman-adv to learn MAC/IP pairs in the DHCP
            range without dealing with floods of ARP traffic, from Linus
            Lüssing.
      
         2) Throttle buffered multicast packet transmission in mt76, from
            Felix Fietkau.
      
         3) Support adaptive interrupt moderation in ice, from Brett Creeley.
      
         4) A lot of struct_size conversions, from Gustavo A. R. Silva.
      
         5) Add peek/push/pop commands to bpftool, as well as bash completion,
            from Stanislav Fomichev.
      
         6) Optimize sk_msg_clone(), from Vakul Garg.
      
         7) Add SO_BINDTOIFINDEX, from David Herrmann.
      
         8) Be more conservative with local resends due to local congestion,
            from Yuchung Cheng.
      
         9) Allow vetoing of unsupported VXLAN FDBs, from Petr Machata.
      
        10) Add health buffer support to devlink, from Eran Ben Elisha.
      
        11) Add TXQ scheduling API to mac80211, from Toke Høiland-Jørgensen.
      
        12) Add statistics to basic packet scheduler filter, from Cong Wang.
      
        13) Add GRE tunnel support for mlxsw Spectrum-2, from Nir Dotan.
      
        14) Lots of new IP tunneling forwarding tests, also from Nir Dotan.
      
        15) Add 3ad stats to bonding, from Nikolay Aleksandrov.
      
        16) Lots of probing improvements for bpftool, from Quentin Monnet.
      
        17) Various nfp driver #ebpf JIT improvements from Jakub Kicinski.
      
        18) Allow #ebpf programs to access gso_segs from skb shared info, from
            Eric Dumazet.
      
        19) Add sock_diag support for AF_XDP sockets, from Björn Töpel.
      
        20) Support 22260 iwlwifi devices, from Luca Coelho.
      
        21) Use rbtree for ipv6 defragmentation, from Peter Oskolkov.
      
        22) Add JMP32 instruction class support to #ebpf, from Jiong Wang.
      
        23) Add spinlock support to #ebpf, from Alexei Starovoitov.
      
        24) Support 256-bit keys and TLS 1.3 in ktls, from Dave Watson.
      
        25) Add device information API to devlink, from Jakub Kicinski.
      
        26) Add new timestamping socket options which are y2038 safe, from
            Deepa Dinamani.
      
        27) Add RX checksum offloading for various sh_eth chips, from Sergei
            Shtylyov.
      
        28) Flow offload infrastructure, from Pablo Neira Ayuso.
      
        29) Numerous cleanups, improvements, and bug fixes to the PHY layer
            and many drivers from Heiner Kallweit.
      
        30) Lots of changes to try and make packet scheduler classifiers run
            lockless as much as possible, from Vlad Buslov.
      
        31) Support BCM957504 chip in bnxt_en driver, from Erik Burrows.
      
        32) Add concurrency tests to tc-tests infrastructure, from Vlad
            Buslov.
      
        33) Add hwmon support to aquantia, from Heiner Kallweit.
      
        34) Allow 64-bit values for SO_MAX_PACING_RATE, from Eric Dumazet.
      
        And I would be remiss if I didn't thank the various major networking
        subsystem maintainers for integrating much of this work before I even
        saw it. Alexei Starovoitov, Daniel Borkmann, Pablo Neira Ayuso,
        Johannes Berg, Kalle Valo, and many others. Thank you!"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2207 commits)
        net/sched: avoid unused-label warning
        net: ignore sysctl_devconf_inherit_init_net without SYSCTL
        phy: mdio-mux: fix Kconfig dependencies
        net: phy: use phy_modify_mmd_changed in genphy_c45_an_config_aneg
        net: dsa: mv88e6xxx: add call to mv88e6xxx_ports_cmode_init to probe for new DSA framework
        selftest/net: Remove duplicate header
        sky2: Disable MSI on Dell Inspiron 1545 and Gateway P-79
        net/mlx5e: Update tx reporter status in case channels were successfully opened
        devlink: Add support for direct reporter health state update
        devlink: Update reporter state to error even if recover aborted
        sctp: call iov_iter_revert() after sending ABORT
        team: Free BPF filter when unregistering netdev
        ip6mr: Do not call __IP6_INC_STATS() from preemptible context
        isdn: mISDN: Fix potential NULL pointer dereference of kzalloc
        net: dsa: mv88e6xxx: support in-band signalling on SGMII ports with external PHYs
        cxgb4/chtls: Prefix adapter flags with CXGB4
        net-sysfs: Switch to bitmap_zalloc()
        mellanox: Switch to bitmap_zalloc()
        bpf: add test cases for non-pointer sanitiation logic
        mlxsw: i2c: Extend initialization by querying resources data
        ...
      64563003
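      Item 4 of the networking pull above refers to struct_size
      conversions. The kernel's struct_size() helper computes the size of
      a structure with a trailing flexible array and saturates at SIZE_MAX
      on overflow. The sketch below is a userspace approximation of that
      idea, not the kernel macro: toy_flex_size(), struct toy_packet and
      toy_packet_alloc_size() are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical structure ending in a flexible array member. */
struct toy_packet {
	uint32_t len;
	uint8_t data[];
};

/*
 * header + elem * count, saturating at SIZE_MAX instead of wrapping.
 * An open-coded "sizeof(*p) + n * sizeof(p->data[0])" would silently
 * wrap for a huge n and lead to an undersized allocation.
 */
size_t toy_flex_size(size_t header, size_t elem, size_t count)
{
	if (elem != 0 && count > (SIZE_MAX - header) / elem)
		return SIZE_MAX;
	return header + elem * count;
}

/* Allocation size for a toy_packet carrying 'count' payload bytes. */
size_t toy_packet_alloc_size(size_t count)
{
	return toy_flex_size(sizeof(struct toy_packet), sizeof(uint8_t), count);
}
```

      An allocator handed SIZE_MAX fails cleanly, which is the point of
      saturating rather than wrapping.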
    • Martin Schwidefsky's avatar
      fcc082f3
    • Linus Torvalds's avatar
      Merge tag 'leds-for-5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds · cd2a3bf0
      Linus Torvalds authored
      Pull LED updates from Jacek Anaszewski:
      
       - finalize previously announced support for initialization of pattern
         triggers from Device Tree
      
       - fix for null deref on firmware load failure in leds-lp55xx-common.c
      
      * tag 'leds-for-5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds:
        leds: lp55xx: fix null deref on firmware load failure
        leds: trigger: timer: Add initialization from Device Tree
        leds: trigger: oneshot: Add initialization from Device Tree
        leds: trigger: pattern: Add pattern initialization from Device Tree
        leds: Add helper for getting default pattern from Device Tree
        dt-bindings: leds: Add pattern initialization from Device Tree
      cd2a3bf0
    • Linus Torvalds's avatar
      Merge tag 'hwmon-for-v5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging · 7629bac6
      Linus Torvalds authored
      Pull hwmon updates from Guenter Roeck:
      
       - Add support for LM96000, DPS-650AB to existing drivers
      
       - Use permission specific SENSOR[_DEVICE]_ATTR variants in several
         drivers
      
       - Replace S_<PERMS> with octal values in several drivers
      
       - Update some license headers
      
       - Various minor fixes and improvements in several drivers
      
      * tag 'hwmon-for-v5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: (89 commits)
        dt-bindings: hwmon: Add missing documentation for lm75
        hwmon: (ad7418) Add device tree probing
        hwmon: (ad741x) Add DT bindings for Analog Devices AD741x
        hwmon: (ntc_thermistor) Convert to new hwmon API
        hwmon: (pwm-fan) Add optional regulator support
        dt-bindings: hwmon: Add optional regulator support to pwm-fan
        hwmon: (f71882fg) Mark expected switch fall-through
        hwmon: (ad7418) Catch I2C errors
        hwmon: (lm85) add support for LM96000 high frequencies
        hwmon: (lm85) support the LM96000
        dt-bindings: Add LM96000 as a trivial device
        hwmon: (lm85) remove freq_map size hardcodes
        hwmon: (occ) Fix license headers
        hwmon: (via-cputemp) Use permission specific SENSOR[_DEVICE]_ATTR variants
        hwmon: (vexpress-hwmon) Use permission specific SENSOR[_DEVICE]_ATTR variants
        hwmon: (tmp421) Replace S_<PERMS> with octal values
        hwmon: (tmp103) Use permission specific SENSOR[_DEVICE]_ATTR variants
        hwmon: (tmp102) Replace S_<PERMS> with octal values
        hwmon: (tc74) Use permission specific SENSOR[_DEVICE]_ATTR variants
        hwmon: (tc654) Use permission specific SENSOR[_DEVICE]_ATTR variants
        ...
      7629bac6
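      The hwmon pull above replaces S_<PERMS> macros with octal values in
      several drivers. A small userspace check of that equivalence, as a
      sketch: TOY_S_IRUGO is defined locally to mirror the kernel-internal
      S_IRUGO (which userspace headers do not provide), and both function
      names are hypothetical.

```c
#include <sys/stat.h>

/* Local mirror of the kernel-internal S_IRUGO aggregate. */
#define TOY_S_IRUGO (S_IRUSR | S_IRGRP | S_IROTH)

/* Symbolic spelling: readable by everyone, writable by the owner. */
unsigned int perms_symbolic(void)
{
	return TOY_S_IRUGO | S_IWUSR;
}

/* Octal spelling of the same mode. */
unsigned int perms_octal(void)
{
	return 0644;
}
```

      Both functions return the same mode bits on Linux; the octal form
      shows the permissions at a glance.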
    • Linus Torvalds's avatar
      Merge tag 'spi-v5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi · dcc75dde
      Linus Torvalds authored
      Pull spi updates from Mark Brown:
       "A fairly quiet release for SPI, the biggest thing is the conversion to
        use GPIO descriptors which is now 90% done but still needs some
        stragglers converting.
      
        Summary:
      
         - Support for inter-word delays
      
         - Conversion of the core and most drivers to use GPIO descriptors for
           GPIO controlled chip selects
      
         - New drivers for NXP FlexSPI and QuadSPI, SiFive and Spreadtrum"
      
      * tag 'spi-v5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: (104 commits)
        spi: sh-msiof: Restrict bits per word to 8/16/24/32 on R-Car Gen2/3
        spi: sifive: Remove redundant dev_err call in sifive_spi_probe()
        spi: sifive: Remove spi_master_put in sifive_spi_remove()
        spi: spi-gpio: fix SPI_CS_HIGH capability
        spi: pxa2xx: Setup maximum supported DMA transfer length
        spi: sifive: Add driver for the SiFive SPI controller
        spi: sifive: Add DT documentation for SiFive SPI controller
        spi: sprd: Add a prefix for SPI DMA channel macros
        spi: sprd: spi: sprd: Add DMA mode support
        dt-bindings: spi: Add the DMA properties for the SPI dma mode
        spi: sprd: Add the SPI irq function for the SPI DMA mode
        dt-bindings: spi: imx: Add an entry for the i.MX8QM compatible
        spi: use gpio[d]_set_value_cansleep for setting chipselect GPIO
        spi: gpio: Advertise support for SPI_CS_HIGH
        spi: sh-msiof: Replace spi_master by spi_controller
        spi: sh-hspi: Replace spi_master by spi_controller
        spi: rspi: Replace spi_master by spi_controller
        spi: atmel-quadspi: add support for sam9x60 qspi controller
        dt-bindings: spi: atmel-quadspi: QuadSPI driver for Microchip SAM9X60
        spi: atmel-quadspi: add support for named peripheral clock
        ...
      dcc75dde
    • Linus Torvalds's avatar
      Merge tag 'regulator-v5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator · 32c0ac3a
      Linus Torvalds authored
      Pull regulator updates from Mark Brown:
       "The bulk of the standout changes in this release are cleanups, with
        the core work being a combination of factoring out common code into
        helpers and the completion of the conversion of the core to use GPIO
        descriptors.
      
        Summary:
      
         - Addition of helper functions for current limits and conversion of
           drivers to use them by Axel Lin.
      
         - Lots and lots of cleanups from Axel Lin.
      
         - Conversion of the core to use GPIO descriptors rather than numbers
           by Linus Walleij.
      
         - New drivers for Maxim MAX77650 and ROHM BD70528"
      
      * tag 'regulator-v5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: (131 commits)
        regulator: mc13xxx: Constify regulator_ops variables
        regulator: palmas: Constify palmas_smps_ramp_delay array
        regulator: wm831x-dcdc: Convert to use regulator_set/get_current_limit_regmap
        regulator: pv88090: Convert to use regulator_set/get_current_limit_regmap
        regulator: pv88080: Convert to use regulator_set/get_current_limit_regmap
        regulator: pv88060: Convert to use regulator_set/get_current_limit_regmap
        regulator: max77650: Convert to use regulator_set/get_current_limit_regmap
        regulator: lp873x: Convert to use regulator_set/get_current_limit_regmap
        regulator: lp872x: Convert to use regulator_set/get_current_limit_regmap
        regulator: da9210: Convert to use regulator_set/get_current_limit_regmap
        regulator: da9055: Convert to use regulator_set/get_current_limit_regmap
        regulator: core: Add set/get_current_limit helpers for regmap users
        regulator: Fix comment for csel_reg and csel_mask
        regulator: stm32-vrefbuf: add power management support
        regulator: 88pm8607: Remove unused fields from struct pm8607_regulator_info
        regulator: 88pm8607: Simplify pm8607_list_voltage implementation
        regulator: cpcap: Constify omap4_regulators and xoom_regulators
        regulator: cpcap: Remove unused vsel_shift from struct cpcap_regulator
        dt-bindings: regulator: tps65218: rectify units of LS3
        dt-bindings: regulator: add LS2 load switch documentation
        ...
      32c0ac3a
    • Linus Torvalds's avatar
      Merge tag 'regmap-v5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap · e48b044e
      Linus Torvalds authored
      Pull regmap updates from Mark Brown:
       "There are only two changes here:
      
         - fix for conflicting attributes on the rbtree node structure
      
         - implementation of main status register support in the interrupt
           code which supports chips that have a register to cut down on the
           number of per-interrupt status registers that need to be checked
           when handling interrupts"
      
      * tag 'regmap-v5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
        regmap: Remove attribute packed from struct 'regcache_rbtree_node'
        regmap: regmap-irq: Add main status register support
      e48b044e
    • Linus Torvalds's avatar
      Merge tag 'mmc-v5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc · 42eaf185
      Linus Torvalds authored
      Pull MMC updates from Ulf Hansson:
       "MMC core:
         - Fixup max_discard/trim calculations
         - Announce SD specs greater than 4.0
         - Add discard support for SD cards
         - Don't do retries for CMD6 (SWITCH command)
         - Various cleanups and re-structuring
      
        MMC host:
         - cqhci:
            * Add maintainers for eMMC CQHCI driver
         - sdhci:
            * Consolidate WP GPIO code
            * Add ADMA3 DMA support for V4 enabled host
            * Fixup card detect support in pci-o2micro driver
            * Add support for CMDQ and SDMMC pads auto-calibration in tegra
              driver
            * Add DCMD support and CMDQ support, support for i.MX6ULL variant,
              fixup HS400 timing issue and add HS400_ES support for i.MX8QXP
              to esdhc-imx driver
            * Avoid CRC errors by adjusting settings to speed mode and fixup
              card initialization for high speed mode in renesas_sdhi
            * Fixup timeout settings for omap
            * Enable 8 bits bus-width support in atmel-mci
            * Convert some legacy code in jz4740 driver to use modern APIs
            * Send a CMD12 to clear DPSM at errors for STM32 sdmmc mmci
              driver"
      
      * tag 'mmc-v5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: (69 commits)
        mmc:fix a bug when max_discard is 0
        mmc: core: Add a debug print when the card may have been replaced
        mmc: core: Add sd discard timeout
        mmc: core: Add discard support to sd
        mmc: sdhci-esdhc-imx: clear the HALT bit when enable CQE
        mmc: core: do not retry CMD6 in __mmc_switch()
        mmc: core: Convert mmc_align_data_size() into an SDIO specific function
        mmc: core: Move mmc_of_parse_voltage() to host.c
        mmc: core: Convert mmc_regulator_get_ocrmask() to static
        mmc: core: Move regulator helpers to separate file
        mmc: of_mmc_spi: Convert to mmc_of_parse_voltage()
        mmc: core: Drop retries as in-parameter to mmc_wait_for_app_cmd()
        mmc: core: Convert mmc_wait_for_app_cmd() to static
        mmc: renesas_sdhi: Change HW adjustment register according to speed mode
        mmc: mmci: Send a CMD12 to clear the DPSM at errors
        mmc: sdhci-xenon: Fixup already marked switch fall-through
        mmc: sdhci-tegra: drop ->get_ro() implementation
        mmc: sdhci-omap: drop ->get_ro() implementation
        mmc: sdhci: use WP GPIO in sdhci_check_ro()
        mmc: wmt-sdmmc: Drop unused include
        ...
      42eaf185
    • Linus Torvalds's avatar
      Merge tag 'i3c/for-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux · c8d950ab
      Linus Torvalds authored
      Pull i3c updates from Boris Brezillon:
      
       - Add a /* fall-through */ comment in the dw-i3c-master driver
      
       - Update the I3C entries in MAINTAINERS to add an IRC channel
      
      * tag 'i3c/for-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux:
        i3c: master: dw-i3c-master: mark expected switch fall-through
        MAINTAINERS: Add an IRC channel for the I3C subsystem
      c8d950ab
    • Linus Torvalds's avatar
      Merge tag 'mtd/for-5.1' of git://git.infradead.org/linux-mtd · 811c16a2
      Linus Torvalds authored
      Pull MTD updates from Boris Brezillon:
       "Core MTD changes:
         - Use struct_size() where appropriate
         - mtd_{read,write}() as wrappers around mtd_{read,write}_oob()
         - Fix misuse of PTR_ERR() in docg3
         - Coding style improvements in mtdcore.c
      
        SPI NOR changes:
          Core changes:
           - Add support of octal mode I/O transfer
           - Add a bunch of SPI NOR entries to the flash_info table
      
          SPI NOR controller driver changes:
           - cadence-quadspi:
              * Add support for Octal SPI controller
              * write up to 8 bytes of data in STIG mode
           - mtk-quadspi:
              * rename config to a common one
              * add SNOR_HWCAPS_READ to spi_nor_hwcaps mask
           - Add Tudor as SPI-NOR co-maintainer
      
        NAND changes:
          NAND core changes:
           - Fourth batch of fixes/cleanup to the raw NAND core impacting
             various controller drivers (Sunxi, Marvell, MTK, TMIO, OMAP2).
           - Check the return code of nand_reset() and nand_readid_op().
           - Remove ->legacy.erase and single_erase().
           - Simplify the locking.
           - Several implicit fall through annotations.
      
          Raw NAND controllers drivers changes:
           - Fix various possible object reference leaks (MTK, JZ4780, Atmel)
           - ST:
              * Add support for STM32 FMC2 NAND flash controller
           - Meson:
              * Add support for Amlogic NAND flash controller
           - Denali:
              * Several cleanup patches
           - Sunxi:
              * Several cleanup patches
           - FSMC:
              * Disable NAND on remove()
              * Reset NAND timings on resume()
      
          SPI-NAND drivers changes:
           - Toshiba:
              * Add support for all Toshiba products.
           - Macronix:
              * Fix ECC status read.
           - Gigadevice:
              * Add support for GD5F1GQ4UExxG"
      
      * tag 'mtd/for-5.1' of git://git.infradead.org/linux-mtd: (64 commits)
        mtd: spi-nor: Fix wrong abbreviation HWCPAS
        mtd: spi-nor: cadence-quadspi: fix spelling mistake: "Couldnt't" -> "Couldn't"
        mtd: spi-nor: Add support for en25qh64
        mtd: spi-nor: Add support for MX25V8035F
        mtd: spi-nor: Add support for EN25Q80A
        mtd: spi-nor: cadence-quadspi: Add support for Octal SPI controller
        dt-bindings: cadence-quadspi: Add new compatible for AM654 SoC
        mtd: spi-nor: split s25fl128s into s25fl128s0 and s25fl128s1
        mtd: spi-nor: cadence-quadspi: write upto 8-bytes data in STIG mode
        mtd: spi-nor: Add support for mx25u3235f
        mtd: rawnand: denali_dt: remove single anonymous clock support
        mtd: rawnand: mtk: fix possible object reference leak
        mtd: rawnand: jz4780: fix possible object reference leak
        mtd: rawnand: atmel: fix possible object reference leak
        mtd: rawnand: fsmc: Disable NAND on remove()
        mtd: rawnand: fsmc: Reset NAND timings on resume()
        mtd: spinand: Add support for GigaDevice GD5F1GQ4UExxG
        mtd: rawnand: denali: remove unused dma_addr field from denali_nand_info
        mtd: rawnand: denali: remove unused function argument 'raw'
        mtd: rawnand: denali: remove unneeded denali_reset_irq() call
        ...
      811c16a2
      Merge tag 'vfio-v5.1-rc1' of git://github.com/awilliam/linux-vfio · a83b0423
      Linus Torvalds authored
      Pull VFIO updates from Alex Williamson:
      
       - Switch mdev to generic UUID API (Andy Shevchenko)
      
       - Fixup platform reset include paths (Masahiro Yamada)
      
       - Fix usage of MINORMASK (Chengguang Xu)
      
       - Remove noise from duplicate spapr table unsets (Alexey Kardashevskiy)
      
       - Restore device state after PM reset (Alex Williamson)
      
       - Ensure memory translation enabled for PCI ROM access (Eric Auger)
      
      * tag 'vfio-v5.1-rc1' of git://github.com/awilliam/linux-vfio:
        vfio_pci: Enable memory accesses before calling pci_map_rom
        vfio/pci: Restore device state on PM transition
        vfio/spapr_tce: Skip unsetting already unset table
        samples/vfio-mdev/mtty: expand minor range when registering chrdev region
        samples/vfio-mdev/mdpy: expand minor range when registering chrdev region
        samples/vfio-mdev/mbochs: expand minor range when registering chrdev region
        vfio: expand minor range when registering chrdev region
        vfio: platform: reset: fix up include directives to remove ccflags-y
        vfio-mdev: Switch to use new generic UUID API
      a83b0423
      fs: Make splice() and tee() take into account O_NONBLOCK flag on pipes · ee5e0011
      Slavomir Kaslev authored
      The current implementation of splice() and tee() ignores O_NONBLOCK set
      on pipe file descriptors and checks only the SPLICE_F_NONBLOCK flag for
      blocking on pipe arguments.  This is inconsistent since splice()-ing
      from/to non-pipe file descriptors does take O_NONBLOCK into
      consideration.
      
      Fix this by promoting O_NONBLOCK, when set on a pipe, to
      SPLICE_F_NONBLOCK.
      
      Some context on how the current implementation of splice() leads to
      inconsistent behavior: in the ongoing work[1] to add VM tracing
      capability to trace-cmd, we stream tracing data over named FIFOs or
      vsockets from guests back to the host.
      
      When we receive SIGINT from the user to stop tracing, we set O_NONBLOCK on
      the input file descriptor and set SPLICE_F_NONBLOCK for the next call to
      splice().  If splice() was blocked waiting on data from the input FIFO,
      after SIGINT splice() restarts with the same arguments (no
      SPLICE_F_NONBLOCK) and blocks again instead of returning -EAGAIN when no
      data is available.
      
      This differs from the splice() behavior when reading from a vsocket or
      when we're doing a traditional read()/write() loop (trace-cmd's
      --nosplice argument).
      
      With this patch applied we get the same behavior in all situations after
      setting O_NONBLOCK which also matches the behavior of doing a
      read()/write() loop instead of splice().
      
      This change has the potential to break users who don't expect
      EAGAIN from splice() when SPLICE_F_NONBLOCK is not set.  On the other
      hand, programs that set O_NONBLOCK and don't anticipate EAGAIN are
      arguably buggy[2].
      
       [1] https://github.com/skaslev/trace-cmd/tree/vsock
       [2] https://github.com/torvalds/linux/blob/d47e3da1759230e394096fd742aad423c291ba48/fs/read_write.c#L1425
      
      Signed-off-by: Slavomir Kaslev <kaslevs@vmware.com>
      Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      ee5e0011
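      The promotion described in the commit above can be illustrated from
      user space. The following is a minimal sketch, not the kernel change
      itself: promote_nonblock() and splice_from_empty_pipe() are
      hypothetical helper names invented for this example. The first mirrors
      the idea of turning O_NONBLOCK on a descriptor into SPLICE_F_NONBLOCK,
      and the second shows that splice() from an empty pipe then fails with
      EAGAIN instead of blocking.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a kernel or libc function): if O_NONBLOCK is
 * set on fd, add SPLICE_F_NONBLOCK to the splice() flags, mirroring in
 * user space the promotion the commit message describes.
 */
unsigned int promote_nonblock(int fd, unsigned int flags)
{
    int fl;

    assert(fd >= 0);
    fl = fcntl(fd, F_GETFL);
    if (fl != -1 && (fl & O_NONBLOCK))
        flags |= SPLICE_F_NONBLOCK;
    return flags;
}

/*
 * Hypothetical demo: splice from an empty pipe whose read end has
 * O_NONBLOCK set. Returns 0 if splice() succeeded, the errno value if
 * splice() failed, or -1 if the setup itself failed.
 */
int splice_from_empty_pipe(void)
{
    int in[2], out[2];
    int fl, err = -1;
    ssize_t n;

    if (pipe(in) == -1)
        return -1;
    if (pipe(out) == -1)
        goto close_in;

    /* Mark the read end of the (empty) input pipe as non-blocking. */
    fl = fcntl(in[0], F_GETFL);
    if (fl == -1 || fcntl(in[0], F_SETFL, fl | O_NONBLOCK) == -1)
        goto close_out;

    /* The write end in[1] stays open, so "no data" is not end-of-file. */
    n = splice(in[0], NULL, out[1], NULL, 4096,
               promote_nonblock(in[0], 0));
    err = (n == -1) ? errno : 0;

close_out:
    close(out[0]);
    close(out[1]);
close_in:
    close(in[0]);
    close(in[1]);
    return err;
}
```

      Because the sketch sets SPLICE_F_NONBLOCK explicitly, it does not
      block on kernels that lack this patch. According to the commit
      message, on a kernel with the patch applied, passing flags of 0 on the
      same O_NONBLOCK pipe should behave the same way, while older kernels
      would block in that case.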
  4. 04 Mar, 2019 3 commits