1. 23 Mar, 2019 28 commits
  2. 19 Mar, 2019 12 commits
    • Greg Kroah-Hartman's avatar
      Linux 4.9.164 · f5fd34f0
      Greg Kroah-Hartman authored
      f5fd34f0
    • Zha Bin's avatar
      vhost/vsock: fix vhost vsock cid hashing inconsistent · 5ebcee94
      Zha Bin authored
      commit 7fbe078c upstream.
      
      The vsock core only supports 32bit CID, but the Virtio-vsock spec define
      CID (dst_cid and src_cid) as u64 and the upper 32bits is reserved as
      zero. This inconsistency causes one bug in vhost vsock driver. The
      scenarios is:
      
        0. A hash table (vhost_vsock_hash) is used to map an CID to a vsock
        object. And hash_min() is used to compute the hash key. hash_min() is
        defined as:
        (sizeof(val) <= 4 ? hash_32(val, bits) : hash_long(val, bits)).
        That means the hash algorithm has dependency on the size of macro
        argument 'val'.
        0. In function vhost_vsock_set_cid(), a 64bit CID is passed to
        hash_min() to compute the hash key when inserting a vsock object into
        the hash table.
        0. In function vhost_vsock_get(), a 32bit CID is passed to hash_min()
        to compute the hash key when looking up a vsock for an CID.
      
      Because the different size of the CID, hash_min() returns different hash
      key, thus fails to look up the vsock object for an CID.
      
      To fix this bug, we keep CID as u64 in the IOCTLs and virtio message
      headers, but explicitly convert u64 to u32 when deal with the hash table
      and vsock core.
      
      Fixes: 834e772c ("vhost/vsock: fix use-after-free in network stack callers")
      Link: https://github.com/stefanha/virtio/blob/vsock/trunk/content.texSigned-off-by: default avatarZha Bin <zhabin@linux.alibaba.com>
      Reviewed-by: default avatarLiu Jiang <gerry@linux.alibaba.com>
      Reviewed-by: default avatarStefan Hajnoczi <stefanha@redhat.com>
      Acked-by: default avatarJason Wang <jasowang@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarShengjing Zhu <i@zhsj.me>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      
      5ebcee94
    • Sakari Ailus's avatar
      of: Support const and non-const use for to_of_node() · 0da773c5
      Sakari Ailus authored
      commit d20dc149 upstream.
      
      Turn to_of_node() into a macro in order to support both const and
      non-const use. Additionally make the fwnode argument to is_of_node() const
      as well.
      Signed-off-by: default avatarSakari Ailus <sakari.ailus@linux.intel.com>
      Reviewed-by: default avatarKieran Bingham <kieran.bingham+renesas@ideasonboard.com>
      Signed-off-by: default avatarRob Herring <robh@kernel.org>
      Cc: Nathan Chancellor <natechancellor@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0da773c5
    • Sergei Shtylyov's avatar
      mmc: tmio_mmc_core: don't claim spurious interrupts · 883f7c32
      Sergei Shtylyov authored
      commit 5c27ff5d upstream.
      
      I have encountered an interrupt storm during the eMMC chip probing (and
      the chip finally didn't get detected).  It turned out that U-Boot left
      the DMAC interrupts enabled while the Linux driver  didn't use those.
      The SDHI driver's interrupt handler somehow assumes that, even if an
      SDIO interrupt didn't happen, it should return IRQ_HANDLED.  I think
      that if none of the enabled interrupts happened and got handled, we
      should return IRQ_NONE -- that way the kernel IRQ code recoginizes
      a spurious interrupt and masks it off pretty quickly...
      
      Fixes: 7729c7a2 ("mmc: tmio: Provide separate interrupt handlers")
      Signed-off-by: default avatarSergei Shtylyov <sergei.shtylyov@cogentembedded.com>
      Reviewed-by: default avatarWolfram Sang <wsa+renesas@sang-engineering.com>
      Tested-by: default avatarWolfram Sang <wsa+renesas@sang-engineering.com>
      Reviewed-by: default avatarSimon Horman <horms+renesas@verge.net.au>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarUlf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      883f7c32
    • Xiao Ni's avatar
      It's wrong to add len to sector_nr in raid10 reshape twice · 3decc9df
      Xiao Ni authored
      commit b761dcf1 upstream.
      
      In reshape_request it already adds len to sector_nr already. It's wrong to add len to
      sector_nr again after adding pages to bio. If there is bad block it can't copy one chunk
      at a time, it needs to goto read_more. Now the sector_nr is wrong. It can cause data
      corruption.
      
      Cc: stable@vger.kernel.org # v3.16+
      Signed-off-by: default avatarXiao Ni <xni@redhat.com>
      Signed-off-by: default avatarSong Liu <songliubraving@fb.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3decc9df
    • Takashi Sakamoto's avatar
      ALSA: bebob: use more identical mod_alias for Saffire Pro 10 I/O against Liquid Saffire 56 · a3a870c0
      Takashi Sakamoto authored
      commit 7dc661bd upstream.
      
      ALSA bebob driver has an entry for Focusrite Saffire Pro 10 I/O. The
      entry matches vendor_id in root directory and model_id in unit
      directory of configuration ROM for IEEE 1394 bus.
      
      On the other hand, configuration ROM of Focusrite Liquid Saffire 56
      has the same vendor_id and model_id. This device is an application of
      TCAT Dice (TCD2220 a.k.a Dice Jr.) however ALSA bebob driver can be
      bound to it randomly instead of ALSA dice driver. At present, drivers
      in ALSA firewire stack can not handle this situation appropriately.
      
      This commit uses more identical mod_alias for Focusrite Saffire Pro 10
      I/O in ALSA bebob driver.
      
      $ python2 crpp < /sys/bus/firewire/devices/fw1/config_rom
                     ROM header and bus information block
                     -----------------------------------------------------------------
      400  042a829d  bus_info_length 4, crc_length 42, crc 33437
      404  31333934  bus_name "1394"
      408  f0649222  irmc 1, cmc 1, isc 1, bmc 1, pmc 0, cyc_clk_acc 100,
                     max_rec 9 (1024), max_rom 2, gen 2, spd 2 (S400)
      40c  00130e01  company_id 00130e     |
      410  000606e0  device_id 01000606e0  | EUI-64 00130e01000606e0
      
                     root directory
                     -----------------------------------------------------------------
      414  0009d31c  directory_length 9, crc 54044
      418  04000014  hardware version
      41c  0c0083c0  node capabilities per IEEE 1394
      420  0300130e  vendor
      424  81000012  --> descriptor leaf at 46c
      428  17000006  model
      42c  81000016  --> descriptor leaf at 484
      430  130120c2  version
      434  d1000002  --> unit directory at 43c
      438  d4000006  --> dependent info directory at 450
      
                     unit directory at 43c
                     -----------------------------------------------------------------
      43c  0004707c  directory_length 4, crc 28796
      440  1200a02d  specifier id: 1394 TA
      444  13010001  version: AV/C
      448  17000006  model
      44c  81000013  --> descriptor leaf at 498
      
                     dependent info directory at 450
                     -----------------------------------------------------------------
      450  000637c7  directory_length 6, crc 14279
      454  120007f5  specifier id
      458  13000001  version
      45c  3affffc7  (immediate value)
      460  3b100000  (immediate value)
      464  3cffffc7  (immediate value)
      468  3d600000  (immediate value)
      
                     descriptor leaf at 46c
                     -----------------------------------------------------------------
      46c  00056f3b  leaf_length 5, crc 28475
      470  00000000  textual descriptor
      474  00000000  minimal ASCII
      478  466f6375  "Focu"
      47c  73726974  "srit"
      480  65000000  "e"
      
                     descriptor leaf at 484
                     -----------------------------------------------------------------
      484  0004a165  leaf_length 4, crc 41317
      488  00000000  textual descriptor
      48c  00000000  minimal ASCII
      490  50726f31  "Pro1"
      494  30494f00  "0IO"
      
                     descriptor leaf at 498
                     -----------------------------------------------------------------
      498  0004a165  leaf_length 4, crc 41317
      49c  00000000  textual descriptor
      4a0  00000000  minimal ASCII
      4a4  50726f31  "Pro1"
      4a8  30494f00  "0IO"
      
      $ python2 crpp < /sys/bus/firewire/devices/fw1/config_rom
                     ROM header and bus information block
                     -----------------------------------------------------------------
      400  040442e4  bus_info_length 4, crc_length 4, crc 17124
      404  31333934  bus_name "1394"
      408  e0ff8112  irmc 1, cmc 1, isc 1, bmc 0, pmc 0, cyc_clk_acc 255,
                     max_rec 8 (512), max_rom 1, gen 1, spd 2 (S400)
      40c  00130e04  company_id 00130e     |
      410  018001e9  device_id 04018001e9  | EUI-64 00130e04018001e9
      
                     root directory
                     -----------------------------------------------------------------
      414  00065612  directory_length 6, crc 22034
      418  0300130e  vendor
      41c  8100000a  --> descriptor leaf at 444
      420  17000006  model
      424  8100000e  --> descriptor leaf at 45c
      428  0c0087c0  node capabilities per IEEE 1394
      42c  d1000001  --> unit directory at 430
      
                     unit directory at 430
                     -----------------------------------------------------------------
      430  000418a0  directory_length 4, crc 6304
      434  1200130e  specifier id
      438  13000001  version
      43c  17000006  model
      440  8100000f  --> descriptor leaf at 47c
      
                     descriptor leaf at 444
                     -----------------------------------------------------------------
      444  00056f3b  leaf_length 5, crc 28475
      448  00000000  textual descriptor
      44c  00000000  minimal ASCII
      450  466f6375  "Focu"
      454  73726974  "srit"
      458  65000000  "e"
      
                     descriptor leaf at 45c
                     -----------------------------------------------------------------
      45c  000762c6  leaf_length 7, crc 25286
      460  00000000  textual descriptor
      464  00000000  minimal ASCII
      468  4c495155  "LIQU"
      46c  49445f53  "ID_S"
      470  41464649  "AFFI"
      474  52455f35  "RE_5"
      478  36000000  "6"
      
                     descriptor leaf at 47c
                     -----------------------------------------------------------------
      47c  000762c6  leaf_length 7, crc 25286
      480  00000000  textual descriptor
      484  00000000  minimal ASCII
      488  4c495155  "LIQU"
      48c  49445f53  "ID_S"
      490  41464649  "AFFI"
      494  52455f35  "RE_5"
      498  36000000  "6"
      
      Cc: <stable@vger.kernel.org> # v3.16+
      Fixes: 25784ec2 ("ALSA: bebob: Add support for Focusrite Saffire/SaffirePro series")
      Signed-off-by: default avatarTakashi Sakamoto <o-takashi@sakamocchi.jp>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a3a870c0
    • Peter Zijlstra's avatar
      perf/x86: Fixup typo in stub functions · 4f964aa5
      Peter Zijlstra authored
      commit f764c58b upstream.
      
      Guenter reported a build warning for CONFIG_CPU_SUP_INTEL=n:
      
        > With allmodconfig-CONFIG_CPU_SUP_INTEL, this patch results in:
        >
        > In file included from arch/x86/events/amd/core.c:8:0:
        > arch/x86/events/amd/../perf_event.h:1036:45: warning: ‘struct cpu_hw_event’ declared inside parameter list will not be visible outside of this definition or declaration
        >  static inline int intel_cpuc_prepare(struct cpu_hw_event *cpuc, int cpu)
      
      While harmless (an unsed pointer is an unused pointer, no matter the type)
      it needs fixing.
      Reported-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Fixes: d01b1f96 ("perf/x86/intel: Make cpuc allocations consistent")
      Link: http://lkml.kernel.org/r/20190315081410.GR5996@hirez.programming.kicks-ass.netSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4f964aa5
    • Eric Dumazet's avatar
      tcp/dccp: remove reqsk_put() from inet_child_forget() · 83fe8732
      Eric Dumazet authored
      commit da8ab578 upstream.
      
      Back in linux-4.4, I inadvertently put a call to reqsk_put() in
      inet_child_forget(), forgetting it could be called from two different
      points.
      
      In the case it is called from inet_csk_reqsk_queue_add(), we want to
      keep the reference on the request socket, since it is released later by
      the caller (tcp_v{4|6}_rcv())
      
      This bug never showed up because atomic_dec_and_test() was not signaling
      the underflow, and SLAB_DESTROY_BY RCU semantic for request sockets
      prevented the request to be put in quarantine.
      
      Recent conversion of socket refcount from atomic_t to refcount_t finally
      exposed the bug.
      
      So move the reqsk_put() to inet_csk_listen_stop() to fix this.
      
      Thanks to Shankara Pailoor for using syzkaller and providing
      a nice set of .config and C repro.
      
      WARNING: CPU: 2 PID: 4277 at lib/refcount.c:186
      refcount_sub_and_test+0x167/0x1b0 lib/refcount.c:186
      Kernel panic - not syncing: panic_on_warn set ...
      
      CPU: 2 PID: 4277 Comm: syz-executor0 Not tainted 4.13.0-rc7 #3
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
      Ubuntu-1.8.2-1ubuntu1 04/01/2014
      Call Trace:
       <IRQ>
       __dump_stack lib/dump_stack.c:16 [inline]
       dump_stack+0xf7/0x1aa lib/dump_stack.c:52
       panic+0x1ae/0x3a7 kernel/panic.c:180
       __warn+0x1c4/0x1d9 kernel/panic.c:541
       report_bug+0x211/0x2d0 lib/bug.c:183
       fixup_bug+0x40/0x90 arch/x86/kernel/traps.c:190
       do_trap_no_signal arch/x86/kernel/traps.c:224 [inline]
       do_trap+0x260/0x390 arch/x86/kernel/traps.c:273
       do_error_trap+0x118/0x340 arch/x86/kernel/traps.c:310
       do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:323
       invalid_op+0x18/0x20 arch/x86/entry/entry_64.S:846
      RIP: 0010:refcount_sub_and_test+0x167/0x1b0 lib/refcount.c:186
      RSP: 0018:ffff88006e006b60 EFLAGS: 00010286
      RAX: 0000000000000026 RBX: 0000000000000000 RCX: 0000000000000000
      RDX: 0000000000000026 RSI: 1ffff1000dc00d2c RDI: ffffed000dc00d60
      RBP: ffff88006e006bf0 R08: 0000000000000001 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000000 R12: 1ffff1000dc00d6d
      R13: 00000000ffffffff R14: 0000000000000001 R15: ffff88006ce9d340
       refcount_dec_and_test+0x1a/0x20 lib/refcount.c:211
       reqsk_put+0x71/0x2b0 include/net/request_sock.h:123
       tcp_v4_rcv+0x259e/0x2e20 net/ipv4/tcp_ipv4.c:1729
       ip_local_deliver_finish+0x2e2/0xba0 net/ipv4/ip_input.c:216
       NF_HOOK include/linux/netfilter.h:248 [inline]
       ip_local_deliver+0x1ce/0x6d0 net/ipv4/ip_input.c:257
       dst_input include/net/dst.h:477 [inline]
       ip_rcv_finish+0x8db/0x19c0 net/ipv4/ip_input.c:397
       NF_HOOK include/linux/netfilter.h:248 [inline]
       ip_rcv+0xc3f/0x17d0 net/ipv4/ip_input.c:488
       __netif_receive_skb_core+0x1fb7/0x31f0 net/core/dev.c:4298
       __netif_receive_skb+0x2c/0x1b0 net/core/dev.c:4336
       process_backlog+0x1c5/0x6d0 net/core/dev.c:5102
       napi_poll net/core/dev.c:5499 [inline]
       net_rx_action+0x6d3/0x14a0 net/core/dev.c:5565
       __do_softirq+0x2cb/0xb2d kernel/softirq.c:284
       do_softirq_own_stack+0x1c/0x30 arch/x86/entry/entry_64.S:898
       </IRQ>
       do_softirq.part.16+0x63/0x80 kernel/softirq.c:328
       do_softirq kernel/softirq.c:176 [inline]
       __local_bh_enable_ip+0x84/0x90 kernel/softirq.c:181
       local_bh_enable include/linux/bottom_half.h:31 [inline]
       rcu_read_unlock_bh include/linux/rcupdate.h:705 [inline]
       ip_finish_output2+0x8ad/0x1360 net/ipv4/ip_output.c:231
       ip_finish_output+0x74e/0xb80 net/ipv4/ip_output.c:317
       NF_HOOK_COND include/linux/netfilter.h:237 [inline]
       ip_output+0x1cc/0x850 net/ipv4/ip_output.c:405
       dst_output include/net/dst.h:471 [inline]
       ip_local_out+0x95/0x160 net/ipv4/ip_output.c:124
       ip_queue_xmit+0x8c6/0x1810 net/ipv4/ip_output.c:504
       tcp_transmit_skb+0x1963/0x3320 net/ipv4/tcp_output.c:1123
       tcp_send_ack.part.35+0x38c/0x620 net/ipv4/tcp_output.c:3575
       tcp_send_ack+0x49/0x60 net/ipv4/tcp_output.c:3545
       tcp_rcv_synsent_state_process net/ipv4/tcp_input.c:5795 [inline]
       tcp_rcv_state_process+0x4876/0x4b60 net/ipv4/tcp_input.c:5930
       tcp_v4_do_rcv+0x58a/0x820 net/ipv4/tcp_ipv4.c:1483
       sk_backlog_rcv include/net/sock.h:907 [inline]
       __release_sock+0x124/0x360 net/core/sock.c:2223
       release_sock+0xa4/0x2a0 net/core/sock.c:2715
       inet_wait_for_connect net/ipv4/af_inet.c:557 [inline]
       __inet_stream_connect+0x671/0xf00 net/ipv4/af_inet.c:643
       inet_stream_connect+0x58/0xa0 net/ipv4/af_inet.c:682
       SYSC_connect+0x204/0x470 net/socket.c:1628
       SyS_connect+0x24/0x30 net/socket.c:1609
       entry_SYSCALL_64_fastpath+0x18/0xad
      RIP: 0033:0x451e59
      RSP: 002b:00007f474843fc08 EFLAGS: 00000216 ORIG_RAX: 000000000000002a
      RAX: ffffffffffffffda RBX: 0000000000718000 RCX: 0000000000451e59
      RDX: 0000000000000010 RSI: 0000000020002000 RDI: 0000000000000007
      RBP: 0000000000000046 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000216 R12: 0000000000000000
      R13: 00007ffc040a0f8f R14: 00007f47484409c0 R15: 0000000000000000
      
      Fixes: ebb516af ("tcp/dccp: fix race at listener dismantle phase")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarShankara Pailoor <sp3485@columbia.edu>
      Tested-by: default avatarShankara Pailoor <sp3485@columbia.edu>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Cc: Guillaume Nault <gnault@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      83fe8732
    • Eric Dumazet's avatar
      gro_cells: make sure device is up in gro_cells_receive() · 7cbb0ab1
      Eric Dumazet authored
      [ Upstream commit 2a5ff07a ]
      
      We keep receiving syzbot reports [1] that show that tunnels do not play
      the rcu/IFF_UP rules properly.
      
      At device dismantle phase, gro_cells_destroy() will be called
      only after a full rcu grace period is observed after IFF_UP
      has been cleared.
      
      This means that IFF_UP needs to be tested before queueing packets
      into netif_rx() or gro_cells.
      
      This patch implements the test in gro_cells_receive() because
      too many callers do not seem to bother enough.
      
      [1]
      BUG: unable to handle kernel paging request at fffff4ca0b9ffffe
      PGD 0 P4D 0
      Oops: 0000 [#1] PREEMPT SMP KASAN
      CPU: 0 PID: 21 Comm: kworker/u4:1 Not tainted 5.0.0+ #97
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Workqueue: netns cleanup_net
      RIP: 0010:__skb_unlink include/linux/skbuff.h:1929 [inline]
      RIP: 0010:__skb_dequeue include/linux/skbuff.h:1945 [inline]
      RIP: 0010:__skb_queue_purge include/linux/skbuff.h:2656 [inline]
      RIP: 0010:gro_cells_destroy net/core/gro_cells.c:89 [inline]
      RIP: 0010:gro_cells_destroy+0x19d/0x360 net/core/gro_cells.c:78
      Code: 03 42 80 3c 20 00 0f 85 53 01 00 00 48 8d 7a 08 49 8b 47 08 49 c7 07 00 00 00 00 48 89 f9 49 c7 47 08 00 00 00 00 48 c1 e9 03 <42> 80 3c 21 00 0f 85 10 01 00 00 48 89 c1 48 89 42 08 48 c1 e9 03
      RSP: 0018:ffff8880aa3f79a8 EFLAGS: 00010a02
      RAX: 00ffffffffffffe8 RBX: ffffe8ffffc64b70 RCX: 1ffff8ca0b9ffffe
      RDX: ffffc6505cffffe8 RSI: ffffffff858410ca RDI: ffffc6505cfffff0
      RBP: ffff8880aa3f7a08 R08: ffff8880aa3e8580 R09: fffffbfff1263645
      R10: fffffbfff1263644 R11: ffffffff8931b223 R12: dffffc0000000000
      R13: 0000000000000000 R14: ffffe8ffffc64b80 R15: ffffe8ffffc64b75
      kobject: 'loop2' (000000004bd7d84a): kobject_uevent_env
      FS:  0000000000000000(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: fffff4ca0b9ffffe CR3: 0000000094941000 CR4: 00000000001406f0
      Call Trace:
      kobject: 'loop2' (000000004bd7d84a): fill_kobj_path: path = '/devices/virtual/block/loop2'
       ip_tunnel_dev_free+0x19/0x60 net/ipv4/ip_tunnel.c:1010
       netdev_run_todo+0x51c/0x7d0 net/core/dev.c:8970
       rtnl_unlock+0xe/0x10 net/core/rtnetlink.c:116
       ip_tunnel_delete_nets+0x423/0x5f0 net/ipv4/ip_tunnel.c:1124
       vti_exit_batch_net+0x23/0x30 net/ipv4/ip_vti.c:495
       ops_exit_list.isra.0+0x105/0x160 net/core/net_namespace.c:156
       cleanup_net+0x3fb/0x960 net/core/net_namespace.c:551
       process_one_work+0x98e/0x1790 kernel/workqueue.c:2173
       worker_thread+0x98/0xe40 kernel/workqueue.c:2319
       kthread+0x357/0x430 kernel/kthread.c:246
       ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:352
      Modules linked in:
      CR2: fffff4ca0b9ffffe
         [ end trace 513fc9c1338d1cb3 ]
      RIP: 0010:__skb_unlink include/linux/skbuff.h:1929 [inline]
      RIP: 0010:__skb_dequeue include/linux/skbuff.h:1945 [inline]
      RIP: 0010:__skb_queue_purge include/linux/skbuff.h:2656 [inline]
      RIP: 0010:gro_cells_destroy net/core/gro_cells.c:89 [inline]
      RIP: 0010:gro_cells_destroy+0x19d/0x360 net/core/gro_cells.c:78
      Code: 03 42 80 3c 20 00 0f 85 53 01 00 00 48 8d 7a 08 49 8b 47 08 49 c7 07 00 00 00 00 48 89 f9 49 c7 47 08 00 00 00 00 48 c1 e9 03 <42> 80 3c 21 00 0f 85 10 01 00 00 48 89 c1 48 89 42 08 48 c1 e9 03
      RSP: 0018:ffff8880aa3f79a8 EFLAGS: 00010a02
      RAX: 00ffffffffffffe8 RBX: ffffe8ffffc64b70 RCX: 1ffff8ca0b9ffffe
      RDX: ffffc6505cffffe8 RSI: ffffffff858410ca RDI: ffffc6505cfffff0
      RBP: ffff8880aa3f7a08 R08: ffff8880aa3e8580 R09: fffffbfff1263645
      R10: fffffbfff1263644 R11: ffffffff8931b223 R12: dffffc0000000000
      kobject: 'loop3' (00000000e4ee57a6): kobject_uevent_env
      R13: 0000000000000000 R14: ffffe8ffffc64b80 R15: ffffe8ffffc64b75
      FS:  0000000000000000(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: fffff4ca0b9ffffe CR3: 0000000094941000 CR4: 00000000001406f0
      
      Fixes: c9e6bc64 ("net: add gro_cells infrastructure")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7cbb0ab1
    • David Howells's avatar
      rxrpc: Fix client call queueing, waiting for channel · 0bea3824
      David Howells authored
      [ Upstream commit 69ffaebb ]
      
      rxrpc_get_client_conn() adds a new call to the front of the waiting_calls
      queue if the connection it's going to use already exists.  This is bad as
      it allows calls to get starved out.
      
      Fix this by adding to the tail instead.
      
      Also change the other enqueue point in the same function to put it on the
      front (ie. when we have a new connection).  This makes the point that in
      the case of a new connection the new call goes at the front (though it
      doesn't actually matter since the queue should be unoccupied).
      
      Fixes: 45025bce ("rxrpc: Improve management and caching of client connection objects")
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Reviewed-by: default avatarMarc Dionne <marc.dionne@auristor.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0bea3824
    • Stefano Brivio's avatar
      vxlan: Fix GRO cells race condition between receive and link delete · 8fa3e879
      Stefano Brivio authored
      [ Upstream commit ad6c9986 ]
      
      If we receive a packet while deleting a VXLAN device, there's a chance
      vxlan_rcv() is called at the same time as vxlan_dellink(). This is fine,
      except that vxlan_dellink() should never ever touch stuff that's still in
      use, such as the GRO cells list.
      
      Otherwise, vxlan_rcv() crashes while queueing packets via
      gro_cells_receive().
      
      Move the gro_cells_destroy() to vxlan_uninit(), which runs after the RCU
      grace period is elapsed and nothing needs the gro_cells anymore.
      
      This is now done in the same way as commit 8e816df8 ("geneve: Use GRO
      cells infrastructure.") originally implemented for GENEVE.
      Reported-by: default avatarJianlin Shi <jishi@redhat.com>
      Fixes: 58ce31cc ("vxlan: GRO support at tunnel layer")
      Signed-off-by: default avatarStefano Brivio <sbrivio@redhat.com>
      Reviewed-by: default avatarSabrina Dubroca <sd@queasysnail.net>
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8fa3e879
    • Daniel Borkmann's avatar
      ipvlan: disallow userns cap_net_admin to change global mode/flags · 510c6252
      Daniel Borkmann authored
      [ Upstream commit 7cc9f700 ]
      
      When running Docker with userns isolation e.g. --userns-remap="default"
      and spawning up some containers with CAP_NET_ADMIN under this realm, I
      noticed that link changes on ipvlan slave device inside that container
      can affect all devices from this ipvlan group which are in other net
      namespaces where the container should have no permission to make changes
      to, such as the init netns, for example.
      
      This effectively allows to undo ipvlan private mode and switch globally to
      bridge mode where slaves can communicate directly without going through
      hostns, or it allows to switch between global operation mode (l2/l3/l3s)
      for everyone bound to the given ipvlan master device. libnetwork plugin
      here is creating an ipvlan master and ipvlan slave in hostns and a slave
      each that is moved into the container's netns upon creation event.
      
      * In hostns:
      
        # ip -d a
        [...]
        8: cilium_host@bond0: <BROADCAST,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
           link/ether 0c:c4:7a:e1:3d:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
           ipvlan  mode l3 bridge numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
           inet 10.41.0.1/32 scope link cilium_host
             valid_lft forever preferred_lft forever
        [...]
      
      * Spawn container & change ipvlan mode setting inside of it:
      
        # docker run -dt --cap-add=NET_ADMIN --network cilium-net --name client -l app=test cilium/netperf
        9fff485d69dcb5ce37c9e33ca20a11ccafc236d690105aadbfb77e4f4170879c
      
        # docker exec -ti client ip -d a
        [...]
        10: cilium0@if4: <BROADCAST,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
            link/ether 0c:c4:7a:e1:3d:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
            ipvlan  mode l3 bridge numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
            inet 10.41.197.43/32 brd 10.41.197.43 scope global cilium0
               valid_lft forever preferred_lft forever
      
        # docker exec -ti client ip link change link cilium0 name cilium0 type ipvlan mode l2
      
        # docker exec -ti client ip -d a
        [...]
        10: cilium0@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
            link/ether 0c:c4:7a:e1:3d:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
            ipvlan  mode l2 bridge numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
            inet 10.41.197.43/32 brd 10.41.197.43 scope global cilium0
               valid_lft forever preferred_lft forever
      
      * In hostns (mode switched to l2):
      
        # ip -d a
        [...]
        8: cilium_host@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
            link/ether 0c:c4:7a:e1:3d:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
            ipvlan  mode l2 bridge numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
            inet 10.41.0.1/32 scope link cilium_host
               valid_lft forever preferred_lft forever
        [...]
      
      Same l3 -> l2 switch would also happen by creating another slave inside
      the container's network namespace when specifying the existing cilium0
      link to derive the actual (bond0) master:
      
        # docker exec -ti client ip link add link cilium0 name cilium1 type ipvlan mode l2
      
        # docker exec -ti client ip -d a
        [...]
        2: cilium1@if4: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
            link/ether 0c:c4:7a:e1:3d:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
            ipvlan  mode l2 bridge numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
        10: cilium0@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
            link/ether 0c:c4:7a:e1:3d:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
            ipvlan  mode l2 bridge numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
            inet 10.41.197.43/32 brd 10.41.197.43 scope global cilium0
               valid_lft forever preferred_lft forever
      
      * In hostns:
      
        # ip -d a
        [...]
        8: cilium_host@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
            link/ether 0c:c4:7a:e1:3d:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
            ipvlan  mode l2 bridge numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
            inet 10.41.0.1/32 scope link cilium_host
               valid_lft forever preferred_lft forever
        [...]
      
      One way to mitigate it is to check CAP_NET_ADMIN permissions of
      the ipvlan master device's ns, and only then allow to change
      mode or flags for all devices bound to it. Above two cases are
      then disallowed after the patch.
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: default avatarMahesh Bandewar <maheshb@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      510c6252