1. 03 Apr, 2019 21 commits
    • Eric Dumazet's avatar
      tcp: do not use ipv6 header for ipv4 flow · 7115df61
      Eric Dumazet authored
      [ Upstream commit 89e41309 ]
      
      When a dual stack tcp listener accepts an ipv4 flow,
      it should not attempt to use an ipv6 header or tcp_v6_iif() helper.
      
      Fixes: 1397ed35 ("ipv6: add flowinfo for tcp6 pkt_options for all cases")
      Fixes: df3687ff ("ipv6: add the IPV6_FL_F_REFLECT flag to IPV6_FL_A_GET")
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7115df61
    • Xin Long's avatar
      sctp: use memdup_user instead of vmemdup_user · cab576f1
      Xin Long authored
      [ Upstream commit ef82bcfa ]
      
      In sctp_setsockopt_bindx()/__sctp_setsockopt_connectx(), it allocates
      memory with addrs_size which is passed from userspace. We used flag
      GFP_USER to put some more restrictions on it in Commit cacc0621
      ("sctp: use GFP_USER for user-controlled kmalloc").
      
      However, since Commit c981f254 ("sctp: use vmemdup_user() rather
      than badly open-coding memdup_user()"), vmemdup_user() has been used,
      which doesn't check GFP_USER flag when goes to vmalloc_*(). So when
      addrs_size is a huge value, it could exhaust memory and even trigger
      oom killer.
      
      This patch is to use memdup_user() instead, in which GFP_USER would
      work to limit the memory allocation with a huge addrs_size.
      
      Note we can't fix it by limiting 'addrs_size', as there's no demand
      for it from RFC.
      
      Reported-by: syzbot+ec1b7575afef85a0e5ca@syzkaller.appspotmail.com
      Fixes: c981f254 ("sctp: use vmemdup_user() rather than badly open-coding memdup_user()")
      Signed-off-by: default avatarXin Long <lucien.xin@gmail.com>
      Acked-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      cab576f1
    • Xin Long's avatar
      sctp: get sctphdr by offset in sctp_compute_cksum · 97265479
      Xin Long authored
      [ Upstream commit 273160ff ]
      
      sctp_hdr(skb) only works when skb->transport_header is set properly.
      
      But in Netfilter, skb->transport_header for ipv6 is not guaranteed
      to be right value for sctphdr. It would cause to fail to check the
      checksum for sctp packets.
      
      So fix it by using offset, which is always right in all places.
      
      v1->v2:
        - Fix the changelog.
      
      Fixes: e6d8b64b ("net: sctp: fix and consolidate SCTP checksumming code")
      Reported-by: default avatarLi Shuang <shuali@redhat.com>
      Signed-off-by: default avatarXin Long <lucien.xin@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      97265479
    • Herbert Xu's avatar
      rhashtable: Still do rehash when we get EEXIST · cf86f7a9
      Herbert Xu authored
      [ Upstream commit 408f13ef ]
      
      As it stands if a shrink is delayed because of an outstanding
      rehash, we will go into a rescheduling loop without ever doing
      the rehash.
      
      This patch fixes this by still carrying out the rehash and then
      rescheduling so that we can shrink after the completion of the
      rehash should it still be necessary.
      
      The return value of EEXIST captures this case and other cases
      (e.g., another thread expanded/rehashed the table at the same
      time) where we should still proceed with the rehash.
      
      Fixes: da20420f ("rhashtable: Add nested tables")
      Reported-by: default avatarJosh Elsasser <jelsasser@appneta.com>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Tested-by: default avatarJosh Elsasser <jelsasser@appneta.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      cf86f7a9
    • Maxime Chevallier's avatar
      packets: Always register packet sk in the same order · 69cea7cf
      Maxime Chevallier authored
      [ Upstream commit a4dc6a49 ]
      
      When using fanouts with AF_PACKET, the demux functions such as
      fanout_demux_cpu will return an index in the fanout socket array, which
      corresponds to the selected socket.
      
      The ordering of this array depends on the order the sockets were added
      to a given fanout group, so for FANOUT_CPU this means sockets are bound
      to cpus in the order they are configured, which is OK.
      
      However, when stopping then restarting the interface these sockets are
      bound to, the sockets are reassigned to the fanout group in the reverse
      order, due to the fact that they were inserted at the head of the
      interface's AF_PACKET socket list.
      
      This means that traffic that was directed to the first socket in the
      fanout group is now directed to the last one after an interface restart.
      
      In the case of FANOUT_CPU, traffic from CPU0 will be directed to the
      socket that used to receive traffic from the last CPU after an interface
      restart.
      
      This commit introduces a helper to add a socket at the tail of a list,
      then uses it to register AF_PACKET sockets.
      
      Note that this changes the order in which sockets are listed in /proc and
      with sock_diag.
      
      Fixes: dc99f600 ("packet: Add fanout support")
      Signed-off-by: default avatarMaxime Chevallier <maxime.chevallier@bootlin.com>
      Acked-by: default avatarWillem de Bruijn <willemb@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      69cea7cf
    • YueHaibing's avatar
      net-sysfs: call dev_hold if kobject_init_and_add success · d9d215be
      YueHaibing authored
      [ Upstream commit a3e23f71 ]
      
      In netdev_queue_add_kobject and rx_queue_add_kobject,
      if sysfs_create_group failed, kobject_put will call
      netdev_queue_release to decrease dev refcont, however
      dev_hold has not be called. So we will see this while
      unregistering dev:
      
      unregister_netdevice: waiting for bcsh0 to become free. Usage count = -1
      Reported-by: default avatarHulk Robot <hulkci@huawei.com>
      Fixes: d0d66837 ("net: don't decrement kobj reference count on init failure")
      Signed-off-by: default avatarYueHaibing <yuehaibing@huawei.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d9d215be
    • Aaro Koskinen's avatar
      net: stmmac: fix memory corruption with large MTUs · 8dcf078d
      Aaro Koskinen authored
      [ Upstream commit 223a960c ]
      
      When using 16K DMA buffers and ring mode, the DES3 refill is not working
      correctly as the function is using a bogus pointer for checking the
      private data. As a result stale pointers will remain in the RX descriptor
      ring, so DMA will now likely overwrite/corrupt some already freed memory.
      
      As simple reproducer, just receive some UDP traffic:
      
      	# ifconfig eth0 down; ifconfig eth0 mtu 9000; ifconfig eth0 up
      	# iperf3 -c 192.168.253.40 -u -b 0 -R
      
      If you didn't crash by now check the RX descriptors to find non-contiguous
      RX buffers:
      
      	cat /sys/kernel/debug/stmmaceth/eth0/descriptors_status
      	[...]
      	1 [0x2be5020]: 0xa3220321 0x9ffc1ffc 0x72d70082 0x130e207e
      					     ^^^^^^^^^^^^^^^^^^^^^
      	2 [0x2be5040]: 0xa3220321 0x9ffc1ffc 0x72998082 0x1311a07e
      					     ^^^^^^^^^^^^^^^^^^^^^
      
      A simple ping test will now report bad data:
      
      	# ping -s 8200 192.168.253.40
      	PING 192.168.253.40 (192.168.253.40) 8200(8228) bytes of data.
      	8208 bytes from 192.168.253.40: icmp_seq=1 ttl=64 time=1.00 ms
      	wrong data byte #8144 should be 0xd0 but was 0x88
      
      Fix the wrong pointer. Also we must refill DES3 only if the DMA buffer
      size is 16K.
      
      Fixes: 54139cf3 ("net: stmmac: adding multiple buffers for rx")
      Signed-off-by: default avatarAaro Koskinen <aaro.koskinen@nokia.com>
      Acked-by: default avatarJose Abreu <joabreu@synopsys.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8dcf078d
    • Eric Dumazet's avatar
      net: rose: fix a possible stack overflow · 7eeb12ed
      Eric Dumazet authored
      [ Upstream commit e5dcc0c3 ]
      
      rose_write_internal() uses a temp buffer of 100 bytes, but a manual
      inspection showed that given arbitrary input, rose_create_facilities()
      can fill up to 110 bytes.
      
      Lets use a tailroom of 256 bytes for peace of mind, and remove
      the bounce buffer : we can simply allocate a big enough skb
      and adjust its length as needed.
      
      syzbot report :
      
      BUG: KASAN: stack-out-of-bounds in memcpy include/linux/string.h:352 [inline]
      BUG: KASAN: stack-out-of-bounds in rose_create_facilities net/rose/rose_subr.c:521 [inline]
      BUG: KASAN: stack-out-of-bounds in rose_write_internal+0x597/0x15d0 net/rose/rose_subr.c:116
      Write of size 7 at addr ffff88808b1ffbef by task syz-executor.0/24854
      
      CPU: 0 PID: 24854 Comm: syz-executor.0 Not tainted 5.0.0+ #97
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x172/0x1f0 lib/dump_stack.c:113
       print_address_description.cold+0x7c/0x20d mm/kasan/report.c:187
       kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317
       check_memory_region_inline mm/kasan/generic.c:185 [inline]
       check_memory_region+0x123/0x190 mm/kasan/generic.c:191
       memcpy+0x38/0x50 mm/kasan/common.c:131
       memcpy include/linux/string.h:352 [inline]
       rose_create_facilities net/rose/rose_subr.c:521 [inline]
       rose_write_internal+0x597/0x15d0 net/rose/rose_subr.c:116
       rose_connect+0x7cb/0x1510 net/rose/af_rose.c:826
       __sys_connect+0x266/0x330 net/socket.c:1685
       __do_sys_connect net/socket.c:1696 [inline]
       __se_sys_connect net/socket.c:1693 [inline]
       __x64_sys_connect+0x73/0xb0 net/socket.c:1693
       do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x458079
      Code: ad b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007f47b8d9dc78 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
      RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000458079
      RDX: 000000000000001c RSI: 0000000020000040 RDI: 0000000000000004
      RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00007f47b8d9e6d4
      R13: 00000000004be4a4 R14: 00000000004ceca8 R15: 00000000ffffffff
      
      The buggy address belongs to the page:
      page:ffffea00022c7fc0 count:0 mapcount:0 mapping:0000000000000000 index:0x0
      flags: 0x1fffc0000000000()
      raw: 01fffc0000000000 0000000000000000 ffffffff022c0101 0000000000000000
      raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
       ffff88808b1ffa80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
       ffff88808b1ffb00: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 03
      >ffff88808b1ffb80: f2 f2 00 00 00 00 00 00 00 00 00 00 00 00 04 f3
                                                                   ^
       ffff88808b1ffc00: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
       ffff88808b1ffc80: 00 00 00 00 00 00 00 f1 f1 f1 f1 f1 f1 01 f2 01
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7eeb12ed
    • Jerome Brunet's avatar
      net: phy: meson-gxl: fix interrupt support · a6f0168e
      Jerome Brunet authored
      [ Upstream commit daa5c4d0 ]
      
      If an interrupt is already pending when the interrupt is enabled on the
      GXL phy, no IRQ will ever be triggered.
      
      The fix is simply to make sure pending IRQs are cleared before setting
      up the irq mask.
      
      Fixes: cf127ff2 ("net: phy: meson-gxl: add interrupt support")
      Signed-off-by: default avatarJerome Brunet <jbrunet@baylibre.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a6f0168e
    • Christoph Paasch's avatar
      net/packet: Set __GFP_NOWARN upon allocation in alloc_pg_vec · 85ef72d8
      Christoph Paasch authored
      [ Upstream commit 398f0132 ]
      
      Since commit fc62814d ("net/packet: fix 4gb buffer limit due to overflow check")
      one can now allocate packet ring buffers >= UINT_MAX. However, syzkaller
      found that that triggers a warning:
      
      [   21.100000] WARNING: CPU: 2 PID: 2075 at mm/page_alloc.c:4584 __alloc_pages_nod0
      [   21.101490] Modules linked in:
      [   21.101921] CPU: 2 PID: 2075 Comm: syz-executor.0 Not tainted 5.0.0 #146
      [   21.102784] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.5.1 01/01/2011
      [   21.103887] RIP: 0010:__alloc_pages_nodemask+0x2a0/0x630
      [   21.104640] Code: fe ff ff 65 48 8b 04 25 c0 de 01 00 48 05 90 0f 00 00 41 bd 01 00 00 00 48 89 44 24 48 e9 9c fe 3
      [   21.107121] RSP: 0018:ffff88805e1cf920 EFLAGS: 00010246
      [   21.107819] RAX: 0000000000000000 RBX: ffffffff85a488a0 RCX: 0000000000000000
      [   21.108753] RDX: 0000000000000000 RSI: dffffc0000000000 RDI: 0000000000000000
      [   21.109699] RBP: 1ffff1100bc39f28 R08: ffffed100bcefb67 R09: ffffed100bcefb67
      [   21.110646] R10: 0000000000000001 R11: ffffed100bcefb66 R12: 000000000000000d
      [   21.111623] R13: 0000000000000000 R14: ffff88805e77d888 R15: 000000000000000d
      [   21.112552] FS:  00007f7c7de05700(0000) GS:ffff88806d100000(0000) knlGS:0000000000000000
      [   21.113612] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   21.114405] CR2: 000000000065c000 CR3: 000000005e58e006 CR4: 00000000001606e0
      [   21.115367] Call Trace:
      [   21.115705]  ? __alloc_pages_slowpath+0x21c0/0x21c0
      [   21.116362]  alloc_pages_current+0xac/0x1e0
      [   21.116923]  kmalloc_order+0x18/0x70
      [   21.117393]  kmalloc_order_trace+0x18/0x110
      [   21.117949]  packet_set_ring+0x9d5/0x1770
      [   21.118524]  ? packet_rcv_spkt+0x440/0x440
      [   21.119094]  ? lock_downgrade+0x620/0x620
      [   21.119646]  ? __might_fault+0x177/0x1b0
      [   21.120177]  packet_setsockopt+0x981/0x2940
      [   21.120753]  ? __fget+0x2fb/0x4b0
      [   21.121209]  ? packet_release+0xab0/0xab0
      [   21.121740]  ? sock_has_perm+0x1cd/0x260
      [   21.122297]  ? selinux_secmark_relabel_packet+0xd0/0xd0
      [   21.123013]  ? __fget+0x324/0x4b0
      [   21.123451]  ? selinux_netlbl_socket_setsockopt+0x101/0x320
      [   21.124186]  ? selinux_netlbl_sock_rcv_skb+0x3a0/0x3a0
      [   21.124908]  ? __lock_acquire+0x529/0x3200
      [   21.125453]  ? selinux_socket_setsockopt+0x5d/0x70
      [   21.126075]  ? __sys_setsockopt+0x131/0x210
      [   21.126533]  ? packet_release+0xab0/0xab0
      [   21.127004]  __sys_setsockopt+0x131/0x210
      [   21.127449]  ? kernel_accept+0x2f0/0x2f0
      [   21.127911]  ? ret_from_fork+0x8/0x50
      [   21.128313]  ? do_raw_spin_lock+0x11b/0x280
      [   21.128800]  __x64_sys_setsockopt+0xba/0x150
      [   21.129271]  ? lockdep_hardirqs_on+0x37f/0x560
      [   21.129769]  do_syscall_64+0x9f/0x450
      [   21.130182]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      We should allocate with __GFP_NOWARN to handle this.
      
      Cc: Kal Conley <kal.conley@dectris.com>
      Cc: Andrey Konovalov <andreyknvl@google.com>
      Fixes: fc62814d ("net/packet: fix 4gb buffer limit due to overflow check")
      Signed-off-by: default avatarChristoph Paasch <cpaasch@apple.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      85ef72d8
    • Paolo Abeni's avatar
      net: datagram: fix unbounded loop in __skb_try_recv_datagram() · 88c64f9c
      Paolo Abeni authored
      [ Upstream commit 0b91bce1 ]
      
      Christoph reported a stall while peeking datagram with an offset when
      busy polling is enabled. __skb_try_recv_datagram() uses as the loop
      termination condition 'queue empty'. When peeking, the socket
      queue can be not empty, even when no additional packets are received.
      
      Address the issue explicitly checking for receive queue changes,
      as currently done by __skb_wait_for_more_packets().
      
      Fixes: 2b5cd0df ("net: Change return type of sk_busy_loop from bool to void")
      Reported-and-tested-by: default avatarChristoph Paasch <cpaasch@apple.com>
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      88c64f9c
    • Dmitry Bogdanov's avatar
      net: aquantia: fix rx checksum offload for UDP/TCP over IPv6 · e4ff39e1
      Dmitry Bogdanov authored
      [ Upstream commit a7faaa0c ]
      
      TCP/UDP checksum validity was propagated to skb
      only if IP checksum is valid.
      But for IPv6 there is no validity as there is no checksum in IPv6.
      This patch propagates TCP/UDP checksum validity regardless of IP checksum.
      
      Fixes: 018423e9 ("net: ethernet: aquantia: Add ring support code")
      Signed-off-by: default avatarIgor Russkikh <igor.russkikh@aquantia.com>
      Signed-off-by: default avatarNikita Danilov <nikita.danilov@aquantia.com>
      Signed-off-by: default avatarDmitry Bogdanov <dmitry.bogdanov@aquantia.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e4ff39e1
    • Bjorn Helgaas's avatar
      mISDN: hfcpci: Test both vendor & device ID for Digium HFC4S · c4084262
      Bjorn Helgaas authored
      [ Upstream commit fae846e2 ]
      
      The device ID alone does not uniquely identify a device.  Test both the
      vendor and device ID to make sure we don't mistakenly think some other
      vendor's 0xB410 device is a Digium HFC4S.  Also, instead of the bare hex
      ID, use the same constant (PCI_DEVICE_ID_DIGIUM_HFC4S) used in the device
      ID table.
      
      No functional change intended.
      Signed-off-by: default avatarBjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c4084262
    • Finn Thain's avatar
      mac8390: Fix mmio access size probe · e0f8c06f
      Finn Thain authored
      [ Upstream commit bb9e5c5b ]
      
      The bug that Stan reported is as follows. After a restart, a 16-bit NIC
      may be incorrectly identified as a 32-bit NIC and stop working.
      
      mac8390 slot.E: Memory length resource not found, probing
      mac8390 slot.E: Farallon EtherMac II-C (type farallon)
      mac8390 slot.E: MAC 00:00:c5:30:c2:99, IRQ 61, 32 KB shared memory at 0xfeed0000, 32-bit access.
      
      The bug never arises after a cold start and only intermittently after a
      warm start. (I didn't investigate why the bug is intermittent.)
      
      It turns out that memcpy_toio() is deprecated and memcmp_withio() also
      has issues. Replacing these calls with mmio accessors fixes the problem.
      Reported-and-tested-by: default avatarStan Johnson <userm57@yahoo.com>
      Fixes: 2964db0f ("m68k: Mac DP8390 update")
      Signed-off-by: default avatarFinn Thain <fthain@telegraphics.com.au>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e0f8c06f
    • Xin Long's avatar
      ipv6: make ip6_create_rt_rcu return ip6_null_entry instead of NULL · be092113
      Xin Long authored
      [ Upstream commit 1c87e79a ]
      
      Jianlin reported a crash:
      
        [  381.484332] BUG: unable to handle kernel NULL pointer dereference at 0000000000000068
        [  381.619802] RIP: 0010:fib6_rule_lookup+0xa3/0x160
        [  382.009615] Call Trace:
        [  382.020762]  <IRQ>
        [  382.030174]  ip6_route_redirect.isra.52+0xc9/0xf0
        [  382.050984]  ip6_redirect+0xb6/0xf0
        [  382.066731]  icmpv6_notify+0xca/0x190
        [  382.083185]  ndisc_redirect_rcv+0x10f/0x160
        [  382.102569]  ndisc_rcv+0xfb/0x100
        [  382.117725]  icmpv6_rcv+0x3f2/0x520
        [  382.133637]  ip6_input_finish+0xbf/0x460
        [  382.151634]  ip6_input+0x3b/0xb0
        [  382.166097]  ipv6_rcv+0x378/0x4e0
      
      It was caused by the lookup function __ip6_route_redirect() returns NULL in
      fib6_rule_lookup() when ip6_create_rt_rcu() returns NULL.
      
      So we fix it by simply making ip6_create_rt_rcu() return ip6_null_entry
      instead of NULL.
      
      v1->v2:
        - move down 'fallback:' to make it more readable.
      
      Fixes: e873e4b9 ("ipv6: use fib6_info_hold_safe() when necessary")
      Reported-by: default avatarJianlin Shi <jishi@redhat.com>
      Suggested-by: default avatarPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: default avatarXin Long <lucien.xin@gmail.com>
      Reviewed-by: default avatarDavid Ahern <dsahern@gmail.com>
      Acked-by: default avatarWei Wang <weiwan@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      be092113
    • Matteo Croce's avatar
      gtp: change NET_UDP_TUNNEL dependency to select · 53adaacb
      Matteo Croce authored
      [ Upstream commit c22da366 ]
      
      Similarly to commit a7603ac1 ("geneve: change NET_UDP_TUNNEL
      dependency to select"), GTP has a dependency on NET_UDP_TUNNEL which
      makes impossible to compile it if no other protocol depending on
      NET_UDP_TUNNEL is selected.
      
      Fix this by changing the depends to a select, and drop NET_IP_TUNNEL from
      the select list, as it already depends on NET_UDP_TUNNEL.
      Signed-off-by: default avatarMatteo Croce <mcroce@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      53adaacb
    • YueHaibing's avatar
      genetlink: Fix a memory leak on error path · 9b8ef421
      YueHaibing authored
      [ Upstream commit ceabee6c ]
      
      In genl_register_family(), when idr_alloc() fails,
      we forget to free the memory we possibly allocate for
      family->attrbuf.
      Reported-by: default avatarHulk Robot <hulkci@huawei.com>
      Fixes: 2ae0f17d ("genetlink: use idr to track families")
      Signed-off-by: default avatarYueHaibing <yuehaibing@huawei.com>
      Reviewed-by: default avatarKirill Tkhai <ktkhai@virtuozzo.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9b8ef421
    • Eric Dumazet's avatar
      dccp: do not use ipv6 header for ipv4 flow · 321461f2
      Eric Dumazet authored
      [ Upstream commit e0aa6770 ]
      
      When a dual stack dccp listener accepts an ipv4 flow,
      it should not attempt to use an ipv6 header or
      inet6_iif() helper.
      
      Fixes: 3df80d93 ("[DCCP]: Introduce DCCPv6")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      321461f2
    • Corey Minyard's avatar
      ipmi_si: Fix crash when using hard-coded device · 6bba17f6
      Corey Minyard authored
      Backport from 41b766d6
      
      When excuting a command like:
        modprobe ipmi_si ports=0xffc0e3 type=bt
      The system would get an oops.
      
      The trouble here is that ipmi_si_hardcode_find_bmc() is called before
      ipmi_si_platform_init(), but initialization of the hard-coded device
      creates an IPMI platform device, which won't be initialized yet.
      
      The real trouble is that hard-coded devices aren't created with
      any device, and the fixup is done later.  So do it right, create the
      hard-coded devices as normal platform devices.
      
      This required adding some new resource types to the IPMI platform
      code for passing information required by the hard-coded device
      and adding some code to remove the hard-coded platform devices
      on module removal.
      
      To enforce the "hard-coded devices passed by the user take priority
      over firmware devices" rule, some special code was added to check
      and see if a hard-coded device already exists.
      
      The backport required some minor fixups and adding the device
      id table that had been added in another change and was used
      in this one.
      Reported-by: default avatarYang Yingliang <yangyingliang@huawei.com>
      Cc: stable@vger.kernel.org # v4.15+
      Signed-off-by: default avatarCorey Minyard <cminyard@mvista.com>
      Tested-by: default avatarYang Yingliang <yangyingliang@huawei.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6bba17f6
    • Marcel Holtmann's avatar
      Bluetooth: Verify that l2cap_get_conf_opt provides large enough buffer · 15d6538a
      Marcel Holtmann authored
      commit 7c9cbd0b upstream.
      
      The function l2cap_get_conf_opt will return L2CAP_CONF_OPT_SIZE + opt->len
      as length value. The opt->len however is in control over the remote user
      and can be used by an attacker to gain access beyond the bounds of the
      actual packet.
      
      To prevent any potential leak of heap memory, it is enough to check that
      the resulting len calculation after calling l2cap_get_conf_opt is not
      below zero. A well formed packet will always return >= 0 here and will
      end with the length value being zero after the last option has been
      parsed. In case of malformed packets messing with the opt->len field the
      length value will become negative. If that is the case, then just abort
      and ignore the option.
      
      In case an attacker uses a too short opt->len value, then garbage will
      be parsed, but that is protected by the unknown option handling and also
      the option parameter size checks.
      Signed-off-by: default avatarMarcel Holtmann <marcel@holtmann.org>
      Reviewed-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarJohan Hedberg <johan.hedberg@intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      15d6538a
    • Marcel Holtmann's avatar
      Bluetooth: Check L2CAP option sizes returned from l2cap_get_conf_opt · 2318c0e4
      Marcel Holtmann authored
      commit af3d5d1c upstream.
      
      When doing option parsing for standard type values of 1, 2 or 4 octets,
      the value is converted directly into a variable instead of a pointer. To
      avoid being tricked into being a pointer, check that for these option
      types that sizes actually match. In L2CAP every option is fixed size and
      thus it is prudent anyway to ensure that the remote side sends us the
      right option size along with option paramters.
      
      If the option size is not matching the option type, then that option is
      silently ignored. It is a protocol violation and instead of trying to
      give the remote attacker any further hints just pretend that option is
      not present and proceed with the default values. Implementation
      following the specification and its qualification procedures will always
      use the correct size and thus not being impacted here.
      
      To keep the code readable and consistent accross all options, a few
      cosmetic changes were also required.
      Signed-off-by: default avatarMarcel Holtmann <marcel@holtmann.org>
      Reviewed-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarJohan Hedberg <johan.hedberg@intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2318c0e4
  2. 27 Mar, 2019 19 commits
    • Greg Kroah-Hartman's avatar
      Linux 4.19.32 · 3a2156c8
      Greg Kroah-Hartman authored
      3a2156c8
    • Baolin Wang's avatar
      power: supply: charger-manager: Fix incorrect return value · 33bd347f
      Baolin Wang authored
      commit f25a646f upstream.
      
      Fix incorrect return value.
      Signed-off-by: default avatarBaolin Wang <baolin.wang@linaro.org>
      Signed-off-by: default avatarSebastian Reichel <sebastian.reichel@collabora.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      33bd347f
    • Hui Wang's avatar
      ALSA: hda - Enforces runtime_resume after S3 and S4 for each codec · 19184190
      Hui Wang authored
      commit b5a236c1 upstream.
      
      Recently we found the audio jack detection stop working after suspend
      on many machines with Realtek codec. Sometimes the audio selection
      dialogue didn't show up after users plugged headhphone/headset into
      the headset jack, sometimes after uses plugged headphone/headset, then
      click the sound icon on the upper-right corner of gnome-desktop, it
      also showed the speaker rather than the headphone.
      
      The root cause is that before suspend, the codec already call the
      runtime_suspend since this codec is not used by any apps, then in
      resume, it will not call runtime_resume for this codec. But for some
      realtek codec (so far, alc236, alc255 and alc891) with the specific
      BIOS, if it doesn't run runtime_resume after suspend, all codec
      functions including jack detection stop working anymore.
      
      This problem existed for a long time, but it was not exposed, that is
      because when problem happens, if users play sound or open
      sound-setting to check audio device, this will trigger calling to
      runtime_resume (via snd_hda_power_up), then the codec starts working
      again before users notice this problem.
      
      Since we don't know how many codec and BIOS combinations have this
      problem, to fix it, let the driver call runtime_resume for all codecs
      in pm_resume, maybe for some codecs, this is not needed, but it is
      harmless. After a codec is runtime resumed, if it is not used by any
      apps, it will be runtime suspended soon and furthermore we don't run
      suspend frequently, this change will not add much power consumption.
      
      Fixes: cc72da7d ("ALSA: hda - Use standard runtime PM for codec power-save control")
      Signed-off-by: default avatarHui Wang <hui.wang@canonical.com>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      19184190
    • Takashi Iwai's avatar
      ALSA: hda - Record the current power state before suspend/resume calls · 156ba57f
      Takashi Iwai authored
      commit 98081ca6 upstream.
      
      Currently we deal with single codec and suspend codec callbacks for
      all S3, S4 and runtime PM handling.  But it turned out that we want
      distinguish the call patterns sometimes, e.g. for applying some init
      sequence only at probing and restoring from hibernate.
      
      This patch slightly modifies the common PM callbacks for HD-audio
      codec and stores the currently processed PM event in power_state of
      the codec's device.power field, which is currently unused.  The codec
      callback can take a look at this event value and judges which purpose
      it's being called.
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      156ba57f
    • Waiman Long's avatar
      locking/lockdep: Add debug_locks check in __lock_downgrade() · 0e0f7b30
      Waiman Long authored
      commit 71492580 upstream.
      
      Tetsuo Handa had reported he saw an incorrect "downgrading a read lock"
      warning right after a previous lockdep warning. It is likely that the
      previous warning turned off lock debugging causing the lockdep to have
      inconsistency states leading to the lock downgrade warning.
      
      Fix that by add a check for debug_locks at the beginning of
      __lock_downgrade().
      Debugged-by: default avatarTetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
      Reported-by: default avatarTetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
      Reported-by: syzbot+53383ae265fb161ef488@syzkaller.appspotmail.com
      Signed-off-by: default avatarWaiman Long <longman@redhat.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Link: https://lkml.kernel.org/r/1547093005-26085-1-git-send-email-longman@redhat.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0e0f7b30
    • Jann Horn's avatar
      x86/unwind: Add hardcoded ORC entry for NULL · 206a76a6
      Jann Horn authored
      commit ac5ceccc upstream.
      
      When the ORC unwinder is invoked for an oops caused by IP==0,
      it currently has no idea what to do because there is no debug information
      for the stack frame of NULL.
      
      But if RIP is NULL, it is very likely that the last successfully executed
      instruction was an indirect CALL/JMP, and it is possible to unwind out in
      the same way as for the first instruction of a normal function. Hardcode
      a corresponding ORC entry.
      
      With an artificially-added NULL call in prctl_set_seccomp(), before this
      patch, the trace is:
      
      Call Trace:
       ? __x64_sys_prctl+0x402/0x680
       ? __ia32_sys_prctl+0x6e0/0x6e0
       ? __do_page_fault+0x457/0x620
       ? do_syscall_64+0x6d/0x160
       ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      After this patch, the trace looks like this:
      
      Call Trace:
       __x64_sys_prctl+0x402/0x680
       ? __ia32_sys_prctl+0x6e0/0x6e0
       ? __do_page_fault+0x457/0x620
       do_syscall_64+0x6d/0x160
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      prctl_set_seccomp() still doesn't show up in the trace because for some
      reason, tail call optimization is only disabled in builds that use the
      frame pointer unwinder.
      Signed-off-by: default avatarJann Horn <jannh@google.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Acked-by: default avatarJosh Poimboeuf <jpoimboe@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: syzbot <syzbot+ca95b2b7aef9e7cbd6ab@syzkaller.appspotmail.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Michal Marek <michal.lkml@markovi.net>
      Cc: linux-kbuild@vger.kernel.org
      Link: https://lkml.kernel.org/r/20190301031201.7416-2-jannh@google.comSigned-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      206a76a6
    • Jann Horn's avatar
      x86/unwind: Handle NULL pointer calls better in frame unwinder · 367ccafb
      Jann Horn authored
      commit f4f34e1b upstream.
      
      When the frame unwinder is invoked for an oops caused by a call to NULL, it
      currently skips the parent function because BP still points to the parent's
      stack frame; the (nonexistent) current function only has the first half of
      a stack frame, and BP doesn't point to it yet.
      
      Add a special case for IP==0 that calculates a fake BP from SP, then uses
      the real BP for the next frame.
      
      Note that this handles first_frame specially: Return information about the
      parent function as long as the saved IP is >=first_frame, even if the fake
      BP points below it.
      
      With an artificially-added NULL call in prctl_set_seccomp(), before this
      patch, the trace is:
      
      Call Trace:
       ? prctl_set_seccomp+0x3a/0x50
       __x64_sys_prctl+0x457/0x6f0
       ? __ia32_sys_prctl+0x750/0x750
       do_syscall_64+0x72/0x160
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      After this patch, the trace is:
      
      Call Trace:
       prctl_set_seccomp+0x3a/0x50
       __x64_sys_prctl+0x457/0x6f0
       ? __ia32_sys_prctl+0x750/0x750
       do_syscall_64+0x72/0x160
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      Signed-off-by: default avatarJann Horn <jannh@google.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Acked-by: default avatarJosh Poimboeuf <jpoimboe@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: syzbot <syzbot+ca95b2b7aef9e7cbd6ab@syzkaller.appspotmail.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Michal Marek <michal.lkml@markovi.net>
      Cc: linux-kbuild@vger.kernel.org
      Link: https://lkml.kernel.org/r/20190301031201.7416-1-jannh@google.comSigned-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      367ccafb
    • Dongli Zhang's avatar
      loop: access lo_backing_file only when the loop device is Lo_bound · 3254dd30
      Dongli Zhang authored
      commit f7c8a412 upstream.
      
      Commit 758a58d0 ("loop: set GENHD_FL_NO_PART_SCAN after
      blkdev_reread_part()") separates "lo->lo_backing_file = NULL" and
      "lo->lo_state = Lo_unbound" into different critical regions protected by
      loop_ctl_mutex.
      
      However, there is below race that the NULL lo->lo_backing_file would be
      accessed when the backend of a loop is another loop device, e.g., loop0's
      backend is a file, while loop1's backend is loop0.
      
      loop0's backend is file            loop1's backend is loop0
      
      __loop_clr_fd()
        mutex_lock(&loop_ctl_mutex);
        lo->lo_backing_file = NULL; --> set to NULL
        mutex_unlock(&loop_ctl_mutex);
                                         loop_set_fd()
                                           mutex_lock_killable(&loop_ctl_mutex);
                                           loop_validate_file()
                                             f = l->lo_backing_file; --> NULL
                                               access if loop0 is not Lo_unbound
        mutex_lock(&loop_ctl_mutex);
        lo->lo_state = Lo_unbound;
        mutex_unlock(&loop_ctl_mutex);
      
      lo->lo_backing_file should be accessed only when the loop device is
      Lo_bound.
      
      In fact, the problem has been introduced already in commit 7ccd0791
      ("loop: Push loop_ctl_mutex down into loop_clr_fd()") after which
      loop_validate_file() could see devices in Lo_rundown state with which it
      did not count. It was harmless at that point but still.
      
      Fixes: 7ccd0791 ("loop: Push loop_ctl_mutex down into loop_clr_fd()")
      Reported-by: syzbot+9bdc1adc1c55e7fe765b@syzkaller.appspotmail.com
      Signed-off-by: default avatarDongli Zhang <dongli.zhang@oracle.com>
      Reviewed-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3254dd30
    • Florian Westphal's avatar
      netfilter: ebtables: remove BUGPRINT messages · 35cdcdc5
      Florian Westphal authored
      commit d824548d upstream.
      
      They are however frequently triggered by syzkaller, so remove them.
      
      ebtables userspace should never trigger any of these, so there is little
      value in making them pr_debug (or ratelimited).
      Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      35cdcdc5
    • Chao Yu's avatar
      f2fs: fix to avoid deadlock of atomic file operations · 1fd916e8
      Chao Yu authored
      commit 48432984 upstream.
      
      Thread A				Thread B
      - __fput
       - f2fs_release_file
        - drop_inmem_pages
         - mutex_lock(&fi->inmem_lock)
         - __revoke_inmem_pages
          - lock_page(page)
      					- open
      					- f2fs_setattr
      					- truncate_setsize
      					 - truncate_inode_pages_range
      					  - lock_page(page)
      					  - truncate_cleanup_page
      					   - f2fs_invalidate_page
      					    - drop_inmem_page
      					    - mutex_lock(&fi->inmem_lock);
      
      We may encounter above ABBA deadlock as reported by Kyungtae Kim:
      
      I'm reporting a bug in linux-4.17.19: "INFO: task hung in
      drop_inmem_page" (no reproducer)
      
      I think this might be somehow related to the following:
      https://groups.google.com/forum/#!searchin/syzkaller-bugs/INFO$3A$20task$20hung$20in$20%7Csort:date/syzkaller-bugs/c6soBTrdaIo/AjAzPeIzCgAJ
      
      =========================================
      INFO: task syz-executor7:10822 blocked for more than 120 seconds.
            Not tainted 4.17.19 #1
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      syz-executor7   D27024 10822   6346 0x00000004
      Call Trace:
       context_switch kernel/sched/core.c:2867 [inline]
       __schedule+0x721/0x1e60 kernel/sched/core.c:3515
       schedule+0x88/0x1c0 kernel/sched/core.c:3559
       schedule_preempt_disabled+0x18/0x30 kernel/sched/core.c:3617
       __mutex_lock_common kernel/locking/mutex.c:833 [inline]
       __mutex_lock+0x5bd/0x1410 kernel/locking/mutex.c:893
       mutex_lock_nested+0x1b/0x20 kernel/locking/mutex.c:908
       drop_inmem_page+0xcb/0x810 fs/f2fs/segment.c:327
       f2fs_invalidate_page+0x337/0x5e0 fs/f2fs/data.c:2401
       do_invalidatepage mm/truncate.c:165 [inline]
       truncate_cleanup_page+0x261/0x330 mm/truncate.c:187
       truncate_inode_pages_range+0x552/0x1610 mm/truncate.c:367
       truncate_inode_pages mm/truncate.c:478 [inline]
       truncate_pagecache+0x6d/0x90 mm/truncate.c:801
       truncate_setsize+0x81/0xa0 mm/truncate.c:826
       f2fs_setattr+0x44f/0x1270 fs/f2fs/file.c:781
       notify_change+0xa62/0xe80 fs/attr.c:313
       do_truncate+0x12e/0x1e0 fs/open.c:63
       do_last fs/namei.c:2955 [inline]
       path_openat+0x2042/0x29f0 fs/namei.c:3505
       do_filp_open+0x1bd/0x2c0 fs/namei.c:3540
       do_sys_open+0x35e/0x4e0 fs/open.c:1101
       __do_sys_open fs/open.c:1119 [inline]
       __se_sys_open fs/open.c:1114 [inline]
       __x64_sys_open+0x89/0xc0 fs/open.c:1114
       do_syscall_64+0xc4/0x4e0 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x4497b9
      RSP: 002b:00007f734e459c68 EFLAGS: 00000246 ORIG_RAX: 0000000000000002
      RAX: ffffffffffffffda RBX: 00007f734e45a6cc RCX: 00000000004497b9
      RDX: 0000000000000104 RSI: 00000000000a8280 RDI: 0000000020000080
      RBP: 000000000071bea0 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
      R13: 0000000000007230 R14: 00000000006f02d0 R15: 00007f734e45a700
      INFO: task syz-executor7:10858 blocked for more than 120 seconds.
            Not tainted 4.17.19 #1
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      syz-executor7   D28880 10858   6346 0x00000004
      Call Trace:
       context_switch kernel/sched/core.c:2867 [inline]
       __schedule+0x721/0x1e60 kernel/sched/core.c:3515
       schedule+0x88/0x1c0 kernel/sched/core.c:3559
       __rwsem_down_write_failed_common kernel/locking/rwsem-xadd.c:565 [inline]
       rwsem_down_write_failed+0x5e6/0xc90 kernel/locking/rwsem-xadd.c:594
       call_rwsem_down_write_failed+0x17/0x30 arch/x86/lib/rwsem.S:117
       __down_write arch/x86/include/asm/rwsem.h:142 [inline]
       down_write+0x58/0xa0 kernel/locking/rwsem.c:72
       inode_lock include/linux/fs.h:713 [inline]
       do_truncate+0x120/0x1e0 fs/open.c:61
       do_last fs/namei.c:2955 [inline]
       path_openat+0x2042/0x29f0 fs/namei.c:3505
       do_filp_open+0x1bd/0x2c0 fs/namei.c:3540
       do_sys_open+0x35e/0x4e0 fs/open.c:1101
       __do_sys_open fs/open.c:1119 [inline]
       __se_sys_open fs/open.c:1114 [inline]
       __x64_sys_open+0x89/0xc0 fs/open.c:1114
       do_syscall_64+0xc4/0x4e0 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x4497b9
      RSP: 002b:00007f734e3b4c68 EFLAGS: 00000246 ORIG_RAX: 0000000000000002
      RAX: ffffffffffffffda RBX: 00007f734e3b56cc RCX: 00000000004497b9
      RDX: 0000000000000104 RSI: 00000000000a8280 RDI: 0000000020000080
      RBP: 000000000071c238 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
      R13: 0000000000007230 R14: 00000000006f02d0 R15: 00007f734e3b5700
      INFO: task syz-executor5:10829 blocked for more than 120 seconds.
            Not tainted 4.17.19 #1
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      syz-executor5   D28760 10829   6308 0x80000002
      Call Trace:
       context_switch kernel/sched/core.c:2867 [inline]
       __schedule+0x721/0x1e60 kernel/sched/core.c:3515
       schedule+0x88/0x1c0 kernel/sched/core.c:3559
       io_schedule+0x21/0x80 kernel/sched/core.c:5179
       wait_on_page_bit_common mm/filemap.c:1100 [inline]
       __lock_page+0x2b5/0x390 mm/filemap.c:1273
       lock_page include/linux/pagemap.h:483 [inline]
       __revoke_inmem_pages+0xb35/0x11c0 fs/f2fs/segment.c:231
       drop_inmem_pages+0xa3/0x3e0 fs/f2fs/segment.c:306
       f2fs_release_file+0x2c7/0x330 fs/f2fs/file.c:1556
       __fput+0x2c7/0x780 fs/file_table.c:209
       ____fput+0x1a/0x20 fs/file_table.c:243
       task_work_run+0x151/0x1d0 kernel/task_work.c:113
       exit_task_work include/linux/task_work.h:22 [inline]
       do_exit+0x8ba/0x30a0 kernel/exit.c:865
       do_group_exit+0x13b/0x3a0 kernel/exit.c:968
       get_signal+0x6bb/0x1650 kernel/signal.c:2482
       do_signal+0x84/0x1b70 arch/x86/kernel/signal.c:810
       exit_to_usermode_loop+0x155/0x190 arch/x86/entry/common.c:162
       prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
       syscall_return_slowpath arch/x86/entry/common.c:265 [inline]
       do_syscall_64+0x445/0x4e0 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x4497b9
      RSP: 002b:00007f1c68e74ce8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
      RAX: fffffffffffffe00 RBX: 000000000071bf80 RCX: 00000000004497b9
      RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000071bf80
      RBP: 000000000071bf80 R08: 0000000000000000 R09: 000000000071bf58
      R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
      R13: 0000000000000000 R14: 00007f1c68e759c0 R15: 00007f1c68e75700
      
      This patch tries to use trylock_page to mitigate such deadlock condition
      for fix.
      Signed-off-by: default avatarChao Yu <yuchao0@huawei.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1fd916e8
    • Myungho Jung's avatar
      RDMA/cma: Rollback source IP address if failing to acquire device · 9dd5053c
      Myungho Jung authored
      commit 5fc01fb8 upstream.
      
      If cma_acquire_dev_by_src_ip() returns error in addr_handler(), the
      device state changes back to RDMA_CM_ADDR_BOUND but the resolved source
      IP address is still left. After that, if rdma_destroy_id() is called
      after rdma_listen(), the device is freed without removed from
      listen_any_list in cma_cancel_operation(). Revert to the previous IP
      address if acquiring device fails.
      
      Reported-by: syzbot+f3ce716af730c8f96637@syzkaller.appspotmail.com
      Signed-off-by: default avatarMyungho Jung <mhjungk@gmail.com>
      Reviewed-by: default avatarParav Pandit <parav@mellanox.com>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9dd5053c
    • Chris Wilson's avatar
      drm: Reorder set_property_atomic to avoid returning with an active ww_ctx · 015b828b
      Chris Wilson authored
      commit 227ad6d9 upstream.
      
      Delay the drm_modeset_acquire_init() until after we check for an
      allocation failure so that we can return immediately upon error without
      having to unwind.
      
      WARNING: lock held when returning to user space!
      4.20.0+ #174 Not tainted
      ------------------------------------------------
      syz-executor556/8153 is leaving the kernel with locks still held!
      1 lock held by syz-executor556/8153:
        #0: 000000005100c85c (crtc_ww_class_acquire){+.+.}, at:
      set_property_atomic+0xb3/0x330 drivers/gpu/drm/drm_mode_object.c:462
      
      Reported-by: syzbot+6ea337c427f5083ebdf2@syzkaller.appspotmail.com
      Fixes: 144a7999 ("drm: Handle properties in the core for atomic drivers")
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Cc: Sean Paul <sean@poorly.run>
      Cc: David Airlie <airlied@linux.ie>
      Cc: <stable@vger.kernel.org> # v4.14+
      Reviewed-by: default avatarMaarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20181230122842.21917-1-chris@chris-wilson.co.ukSigned-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      015b828b
    • Kefeng Wang's avatar
      Bluetooth: hci_ldisc: Postpone HCI_UART_PROTO_READY bit set in hci_uart_set_proto() · e365b940
      Kefeng Wang authored
      commit 56897b21 upstream.
      
      task A:                                task B:
      hci_uart_set_proto                     flush_to_ldisc
       - p->open(hu) -> h5_open  //alloc h5  - receive_buf
       - set_bit HCI_UART_PROTO_READY         - tty_port_default_receive_buf
       - hci_uart_register_dev                 - tty_ldisc_receive_buf
                                                - hci_uart_tty_receive
      				           - test_bit HCI_UART_PROTO_READY
      				            - h5_recv
       - clear_bit HCI_UART_PROTO_READY             while() {
       - p->open(hu) -> h5_close //free h5
      				              - h5_rx_3wire_hdr
      				               - h5_reset()  //use-after-free
                                                    }
      
      It could use ioctl to set hci uart proto, but there is
      a use-after-free issue when hci_uart_register_dev() fail in
      hci_uart_set_proto(), see stack above, fix this by setting
      HCI_UART_PROTO_READY bit only when hci_uart_register_dev()
      return success.
      
      Reported-by: syzbot+899a33dc0fa0dbaf06a6@syzkaller.appspotmail.com
      Signed-off-by: default avatarKefeng Wang <wangkefeng.wang@huawei.com>
      Reviewed-by: default avatarJeremy Cline <jcline@redhat.com>
      Signed-off-by: default avatarMarcel Holtmann <marcel@holtmann.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e365b940
    • Jeremy Cline's avatar
      Bluetooth: hci_ldisc: Initialize hci_dev before open() · f67202f7
      Jeremy Cline authored
      commit 32a7b4cb upstream.
      
      The hci_dev struct hdev is referenced in work queues and timers started
      by open() in some protocols. This creates a race between the
      initialization function and the work or timer which can result hdev
      being dereferenced while it is still null.
      
      The syzbot report contains a reliable reproducer which causes a null
      pointer dereference of hdev in hci_uart_write_work() by making the
      memory allocation for hdev fail.
      
      To fix this, ensure hdev is valid from before calling a protocol's
      open() until after calling a protocol's close().
      
      Reported-by: syzbot+257790c15bcdef6fe00c@syzkaller.appspotmail.com
      Signed-off-by: default avatarJeremy Cline <jcline@redhat.com>
      Signed-off-by: default avatarMarcel Holtmann <marcel@holtmann.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f67202f7
    • Myungho Jung's avatar
      Bluetooth: Fix decrementing reference count twice in releasing socket · 4b390513
      Myungho Jung authored
      commit e20a2e9c upstream.
      
      When releasing socket, it is possible to enter hci_sock_release() and
      hci_sock_dev_event(HCI_DEV_UNREG) at the same time in different thread.
      The reference count of hdev should be decremented only once from one of
      them but if storing hdev to local variable in hci_sock_release() before
      detached from socket and setting to NULL in hci_sock_dev_event(),
      hci_dev_put(hdev) is unexpectedly called twice. This is resolved by
      referencing hdev from socket after bt_sock_unlink() in
      hci_sock_release().
      
      Reported-by: syzbot+fdc00003f4efff43bc5b@syzkaller.appspotmail.com
      Signed-off-by: default avatarMyungho Jung <mhjungk@gmail.com>
      Signed-off-by: default avatarMarcel Holtmann <marcel@holtmann.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4b390513
    • Myungho Jung's avatar
      Bluetooth: hci_uart: Check if socket buffer is ERR_PTR in h4_recv_buf() · 4e0ca4bf
      Myungho Jung authored
      commit 1dc2d785 upstream.
      
      h4_recv_buf() callers store the return value to socket buffer and
      recursively pass the buffer to h4_recv_buf() without protection. So,
      ERR_PTR returned from h4_recv_buf() can be dereferenced, if called again
      before setting the socket buffer to NULL from previous error. Check if
      skb is ERR_PTR in h4_recv_buf().
      
      Reported-by: syzbot+017a32f149406df32703@syzkaller.appspotmail.com
      Signed-off-by: default avatarMyungho Jung <mhjungk@gmail.com>
      Signed-off-by: default avatarMarcel Holtmann <marcel@holtmann.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4e0ca4bf
    • Hans Verkuil's avatar
      media: v4l2-ctrls.c/uvc: zero v4l2_event · 6bef442e
      Hans Verkuil authored
      commit f45f3f75 upstream.
      
      Control events can leak kernel memory since they do not fully zero the
      event. The same code is present in both v4l2-ctrls.c and uvc_ctrl.c, so
      fix both.
      
      It appears that all other event code is properly zeroing the structure,
      it's these two places.
      Signed-off-by: default avatarHans Verkuil <hverkuil-cisco@xs4all.nl>
      Reported-by: syzbot+4f021cf3697781dbd9fb@syzkaller.appspotmail.com
      Reviewed-by: default avatarLaurent Pinchart <laurent.pinchart@ideasonboard.com>
      Signed-off-by: default avatarMauro Carvalho Chehab <mchehab+samsung@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6bef442e
    • zhangyi (F)'s avatar
      ext4: brelse all indirect buffer in ext4_ind_remove_space() · d12d8641
      zhangyi (F) authored
      commit 674a2b27 upstream.
      
      All indirect buffers get by ext4_find_shared() should be released no
      mater the branch should be freed or not. But now, we forget to release
      the lower depth indirect buffers when removing space from the same
      higher depth indirect block. It will lead to buffer leak and futher
      more, it may lead to quota information corruption when using old quota,
      consider the following case.
      
       - Create and mount an empty ext4 filesystem without extent and quota
         features,
       - quotacheck and enable the user & group quota,
       - Create some files and write some data to them, and then punch hole
         to some files of them, it may trigger the buffer leak problem
         mentioned above.
       - Disable quota and run quotacheck again, it will create two new
         aquota files and write the checked quota information to them, which
         probably may reuse the freed indirect block(the buffer and page
         cache was not freed) as data block.
       - Enable quota again, it will invoke
         vfs_load_quota_inode()->invalidate_bdev() to try to clean unused
         buffers and pagecache. Unfortunately, because of the buffer of quota
         data block is still referenced, quota code cannot read the up to date
         quota info from the device and lead to quota information corruption.
      
      This problem can be reproduced by xfstests generic/231 on ext3 file
      system or ext4 file system without extent and quota features.
      
      This patch fix this problem by releasing the missing indirect buffers,
      in ext4_ind_remove_space().
      Reported-by: default avatarHulk Robot <hulkci@huawei.com>
      Signed-off-by: default avatarzhangyi (F) <yi.zhang@huawei.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Reviewed-by: default avatarJan Kara <jack@suse.cz>
      Cc: stable@kernel.org
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d12d8641
    • Lukas Czerner's avatar
      ext4: fix data corruption caused by unaligned direct AIO · 76c9ee6b
      Lukas Czerner authored
      commit 372a03e0 upstream.
      
      Ext4 needs to serialize unaligned direct AIO because the zeroing of
      partial blocks of two competing unaligned AIOs can result in data
      corruption.
      
      However it decides not to serialize if the potentially unaligned aio is
      past i_size with the rationale that no pending writes are possible past
      i_size. Unfortunately if the i_size is not block aligned and the second
      unaligned write lands past i_size, but still into the same block, it has
      the potential of corrupting the previous unaligned write to the same
      block.
      
      This is (very simplified) reproducer from Frank
      
          // 41472 = (10 * 4096) + 512
          // 37376 = 41472 - 4096
      
          ftruncate(fd, 41472);
          io_prep_pwrite(iocbs[0], fd, buf[0], 4096, 37376);
          io_prep_pwrite(iocbs[1], fd, buf[1], 4096, 41472);
      
          io_submit(io_ctx, 1, &iocbs[1]);
          io_submit(io_ctx, 1, &iocbs[2]);
      
          io_getevents(io_ctx, 2, 2, events, NULL);
      
      Without this patch the 512B range from 40960 up to the start of the
      second unaligned write (41472) is going to be zeroed overwriting the data
      written by the first write. This is a data corruption.
      
      00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
      *
      00009200  30 30 30 30 30 30 30 30  30 30 30 30 30 30 30 30
      *
      0000a000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
      *
      0000a200  31 31 31 31 31 31 31 31  31 31 31 31 31 31 31 31
      
      With this patch the data corruption is avoided because we will recognize
      the unaligned_aio and wait for the unwritten extent conversion.
      
      00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
      *
      00009200  30 30 30 30 30 30 30 30  30 30 30 30 30 30 30 30
      *
      0000a200  31 31 31 31 31 31 31 31  31 31 31 31 31 31 31 31
      *
      0000b200
      Reported-by: default avatarFrank Sorenson <fsorenso@redhat.com>
      Signed-off-by: default avatarLukas Czerner <lczerner@redhat.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Fixes: e9e3bcec ("ext4: serialize unaligned asynchronous DIO")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      76c9ee6b