1. 12 Jan, 2020 28 commits
  2. 09 Jan, 2020 12 commits
    • Linux 4.19.94 · cb1f9a16
      Greg Kroah-Hartman authored
    • perf/x86/intel/bts: Fix the use of page_private() · 78880475
      Alexander Shishkin authored
      [ Upstream commit ff61541c ]
      
      Commit
      
        8062382c ("perf/x86/intel/bts: Add BTS PMU driver")
      
      brought in a warning with the BTS buffer initialization
      that is easily tripped with (assuming KPTI is disabled):
      
      instantly throwing:
      
      > ------------[ cut here ]------------
      > WARNING: CPU: 2 PID: 326 at arch/x86/events/intel/bts.c:86 bts_buffer_setup_aux+0x117/0x3d0
      > Modules linked in:
      > CPU: 2 PID: 326 Comm: perf Not tainted 5.4.0-rc8-00291-gceb9e773 #904
      > RIP: 0010:bts_buffer_setup_aux+0x117/0x3d0
      > Call Trace:
      >  rb_alloc_aux+0x339/0x550
      >  perf_mmap+0x607/0xc70
      >  mmap_region+0x76b/0xbd0
      ...
      
      It appears to assume (for lost raisins) that PagePrivate() is set,
      while later it actually tests for PagePrivate() before using
      page_private().
      
      Make it consistent and always check PagePrivate() before using
      page_private().
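The consistency rule above can be illustrated with a minimal userspace sketch. It assumes a simplified stand-in for struct page in which the private value holds an allocation order; the names model_page and model_buf_nr_pages are hypothetical and are not taken from the patch. In the kernel the flag test is PagePrivate() and the value is read with page_private().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for struct page; not the kernel definition. */
struct model_page {
	bool has_private;      /* models the PagePrivate() flag */
	unsigned long private; /* models page_private(); assumed to be an order */
};

/* Number of base pages behind this entry: 1 unless an order was recorded. */
static size_t model_buf_nr_pages(const struct model_page *page)
{
	assert(page != NULL);

	/* Always test the flag first; the private field is meaningless without it. */
	if (!page->has_private)
		return 1;

	return (size_t)1 << page->private;
}
```

Every caller goes through the same helper, so no code path can read the private value without first checking the flag.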
      
      Fixes: 8062382c ("perf/x86/intel/bts: Add BTS PMU driver")
      Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Link: https://lkml.kernel.org/r/20191205142853.28894-2-alexander.shishkin@linux.intel.com
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • xen/blkback: Avoid unmapping unmapped grant pages · 87d43527
      SeongJae Park authored
      [ Upstream commit f9bd84a8 ]
      
      For each I/O request, blkback first maps the foreign pages for the
      request to its local pages.  If an allocation of a local page for the
      mapping fails, it should unmap every mapping already made for the
      request.
      
      However, blkback's handling mechanism for the allocation failure does
      not mark the remaining foreign pages as unmapped.  Therefore, the unmap
      function merely tries to unmap every valid grant page for the request,
      including the pages not mapped due to the allocation failure.  On a
      system that fails the allocation frequently, this problem leads to
      the following kernel crash.
      
        [  372.012538] BUG: unable to handle kernel NULL pointer dereference at 0000000000000001
        [  372.012546] IP: [<ffffffff814071ac>] gnttab_unmap_refs.part.7+0x1c/0x40
        [  372.012557] PGD 16f3e9067 PUD 16426e067 PMD 0
        [  372.012562] Oops: 0002 [#1] SMP
        [  372.012566] Modules linked in: act_police sch_ingress cls_u32
        ...
        [  372.012746] Call Trace:
        [  372.012752]  [<ffffffff81407204>] gnttab_unmap_refs+0x34/0x40
        [  372.012759]  [<ffffffffa0335ae3>] xen_blkbk_unmap+0x83/0x150 [xen_blkback]
        ...
        [  372.012802]  [<ffffffffa0336c50>] dispatch_rw_block_io+0x970/0x980 [xen_blkback]
        ...
        Decompressing Linux... Parsing ELF... done.
        Booting the kernel.
        [    0.000000] Initializing cgroup subsys cpuset
      
      This commit fixes this problem by marking the grant pages of the given
      request that were not mapped due to the allocation failure as invalid.
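The idea can be sketched in userspace as follows. This is a simplified model, not the blkback code: model_grant_page, model_map, model_unmap and MODEL_INVALID_HANDLE are hypothetical names, and the fail_at parameter only simulates a failed page allocation.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_INVALID_HANDLE (~0u) /* hypothetical "not mapped" marker */

struct model_grant_page {
	unsigned int handle;
};

/*
 * Map n pages. fail_at simulates an allocation failure at that index
 * (pass n for "never fails"). Returns 0 on success, -1 on failure.
 */
static int model_map(struct model_grant_page *pages, size_t n, size_t fail_at)
{
	size_t i;

	assert(pages != NULL);

	for (i = 0; i < n; i++) {
		if (i == fail_at)
			goto out_of_memory;
		pages[i].handle = (unsigned int)i + 1; /* pretend handle */
	}
	return 0;

out_of_memory:
	/* Mark everything that was never mapped, so unmap will skip it. */
	for (; i < n; i++)
		pages[i].handle = MODEL_INVALID_HANDLE;
	return -1;
}

/* Unmap only pages with a valid handle; returns how many were unmapped. */
static size_t model_unmap(struct model_grant_page *pages, size_t n)
{
	size_t i, unmapped = 0;

	for (i = 0; i < n; i++) {
		if (pages[i].handle == MODEL_INVALID_HANDLE)
			continue;
		pages[i].handle = MODEL_INVALID_HANDLE;
		unmapped++;
	}
	return unmapped;
}
```

With a failure at index 2 of 4, only the two pages that were really mapped are unmapped, even if the remaining entries held stale handles before the call.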
      
      Fixes: c6cc142d ("xen-blkback: use balloon pages for all mappings")
      Reviewed-by: David Woodhouse <dwmw@amazon.de>
      Reviewed-by: Maximilian Heyne <mheyne@amazon.de>
      Reviewed-by: Paul Durrant <pdurrant@amazon.co.uk>
      Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
      Signed-off-by: SeongJae Park <sjpark@amazon.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • s390/smp: fix physical to logical CPU map for SMT · a5011c78
      Heiko Carstens authored
      [ Upstream commit 72a81ad9 ]
      
      If an SMT capable system is not IPL'ed from the first CPU the setup of
      the physical to logical CPU mapping is broken: the IPL core gets CPU
      number 0, but then the next core gets CPU number 1. Correct would be
      that all SMT threads of CPU 0 get the subsequent logical CPU numbers.
      
      This is important since a lot of code (like e.g. the CPU topology
      code) assumes that CPU maps are setup like this. If the mapping is
      broken the system will not IPL due to broken topology masks:
      
      [    1.716341] BUG: arch topology broken
      [    1.716342]      the SMT domain not a subset of the MC domain
      [    1.716343] BUG: arch topology broken
      [    1.716344]      the MC domain not a subset of the BOOK domain
      
      This scenario usually cannot happen since LPARs are always IPL'ed
      from CPU 0 and re-IPL is also initiated from CPU 0. However, older
      kernels did initiate re-IPL on an arbitrary CPU. Therefore, a re-IPL
      from an old kernel into a new kernel may lead to a crash.
      
      Fix this by setting up the physical to logical CPU mapping correctly.
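The intended numbering can be illustrated with a small userspace sketch. It assumes a fixed number of SMT threads per core and a flat array indexed by physical position; model_assign_logical and its parameters are hypothetical and do not mirror the s390 code.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Fill logical[core * threads + thread] with logical CPU numbers.
 * The IPL core is numbered first so that its threads get 0..threads-1,
 * then the remaining cores follow in physical order.
 */
static void model_assign_logical(unsigned int *logical, size_t cores,
				 size_t threads, size_t ipl_core)
{
	unsigned int next = 0;
	size_t core, t;

	assert(logical != NULL && ipl_core < cores);

	/* All SMT threads of the IPL core get consecutive numbers from 0. */
	for (t = 0; t < threads; t++)
		logical[ipl_core * threads + t] = next++;

	/* Every other core gets the next consecutive block. */
	for (core = 0; core < cores; core++) {
		if (core == ipl_core)
			continue;
		for (t = 0; t < threads; t++)
			logical[core * threads + t] = next++;
	}
}
```

In this model, each core's threads always occupy one contiguous block of logical numbers, regardless of which core was used for IPL.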

      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ubifs: ubifs_tnc_start_commit: Fix OOB in layout_in_gaps · 7764ed0b
      Zhihao Cheng authored
      [ Upstream commit 6abf5726 ]
      
      Running stress-test test_2 in mtd-utils on a UBI device, we sometimes
      get the following oops message:
      
        BUG: unable to handle page fault for address: ffffffff00000140
        #PF: supervisor read access in kernel mode
        #PF: error_code(0x0000) - not-present page
        PGD 280a067 P4D 280a067 PUD 0
        Oops: 0000 [#1] SMP
        CPU: 0 PID: 60 Comm: kworker/u16:1 Kdump: loaded Not tainted 5.2.0 #13
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0
        -0-ga698c8995f-prebuilt.qemu.org 04/01/2014
        Workqueue: writeback wb_workfn (flush-ubifs_0_0)
        RIP: 0010:rb_next_postorder+0x2e/0xb0
        Code: 80 db 03 01 48 85 ff 0f 84 97 00 00 00 48 8b 17 48 83 05 bc 80 db
        03 01 48 83 e2 fc 0f 84 82 00 00 00 48 83 05 b2 80 db 03 01 <48> 3b 7a
        10 48 89 d0 74 02 f3 c3 48 8b 52 08 48 83 05 a3 80 db 03
        RSP: 0018:ffffc90000887758 EFLAGS: 00010202
        RAX: ffff888129ae4700 RBX: ffff888138b08400 RCX: 0000000080800001
        RDX: ffffffff00000130 RSI: 0000000080800024 RDI: ffff888138b08400
        RBP: ffff888138b08400 R08: ffffea0004a6b920 R09: 0000000000000000
        R10: ffffc90000887740 R11: 0000000000000001 R12: ffff888128d48000
        R13: 0000000000000800 R14: 000000000000011e R15: 00000000000007c8
        FS:  0000000000000000(0000) GS:ffff88813ba00000(0000)
        knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: ffffffff00000140 CR3: 000000013789d000 CR4: 00000000000006f0
        DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
        Call Trace:
          destroy_old_idx+0x5d/0xa0 [ubifs]
          ubifs_tnc_start_commit+0x4fe/0x1380 [ubifs]
          do_commit+0x3eb/0x830 [ubifs]
          ubifs_run_commit+0xdc/0x1c0 [ubifs]
      
      The oops above is caused by a slab out-of-bounds access in the do-while
      loop of layout_in_gaps(), which is called indirectly by
      ubifs_tnc_start_commit(). The loop places index nodes into the gaps
      created by obsolete index nodes in non-empty index LEBs until the
      remaining index nodes fit entirely into the pre-allocated empty LEBs.
      @c->gap_lebs points to an integer array that records the LEB numbers
      used by the 'in-the-gaps' method. Whenever a suitable index LEB is
      found, its lnum is appended to that array. The array is allocated with
      size ((@c->lst.idx_lebs + 1) * sizeof(int)) before the loop and cannot
      be resized inside it. However, @c->lst.idx_lebs can be increased by
      ubifs_change_lp() (called via
      layout_leb_in_gaps->ubifs_find_dirty_idx_leb->get_idx_gc_leb) during the
      loop, so the out-of-bounds access happens when the number of loop
      iterations exceeds the original value of @c->lst.idx_lebs. See the
      details in https://bugzilla.kernel.org/show_bug.cgi?id=204229.

      This patch fixes the out-of-bounds access in layout_in_gaps().
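One general way to avoid this class of bug is to check the capacity on every append and grow the array when it is full. The sketch below is a userspace illustration under that assumption; the commit message does not say how the patch itself does it, and model_gap_list and model_gap_list_push are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

/* Growable list of LEB numbers; a stand-in for the fixed-size array. */
struct model_gap_list {
	int *lebs;
	size_t used;
	size_t cap;
};

/* Append lnum, growing the array when full. Returns 0, or -1 if out of memory. */
static int model_gap_list_push(struct model_gap_list *gl, int lnum)
{
	assert(gl != NULL);

	if (gl->used == gl->cap) {
		size_t new_cap = gl->cap ? gl->cap * 2 : 4;
		int *grown = realloc(gl->lebs, new_cap * sizeof(*grown));

		if (!grown)
			return -1; /* the old array is still valid */
		gl->lebs = grown;
		gl->cap = new_cap;
	}

	gl->lebs[gl->used++] = lnum;
	return 0;
}
```

Because the bound is checked at the point of the write, the list stays in bounds even if the number of loop iterations ends up larger than the estimate made before the loop.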

      Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
      Signed-off-by: Richard Weinberger <richard@nod.at>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net: add annotations on hh->hh_len lockless accesses · bc5fc4a6
      Eric Dumazet authored
      [ Upstream commit c305c6ae ]
      
      KCSAN reported a data-race [1]
      
      While we can use READ_ONCE() on the read sides,
      we need to make sure hh->hh_len is written last.
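The "write the length last" ordering can be modelled in userspace with C11 atomics. This is a sketch under assumptions, not the kernel code: model_hh_cache, model_hh_publish and model_hh_read are hypothetical names, and a release store paired with an acquire load stands in for the kernel's own primitives.

```c
#include <assert.h>
#include <stdatomic.h>
#include <string.h>

#define MODEL_HDR_MAX 16

struct model_hh_cache {
	unsigned char data[MODEL_HDR_MAX];
	atomic_uint len; /* 0 means "header not published yet" */
};

/* Writer: fill in the header bytes first, publish the length last. */
static void model_hh_publish(struct model_hh_cache *hh,
			     const unsigned char *hdr, unsigned int len)
{
	assert(len > 0 && len <= MODEL_HDR_MAX);

	memcpy(hh->data, hdr, len);
	/* Release store: the bytes above become visible before the length. */
	atomic_store_explicit(&hh->len, len, memory_order_release);
}

/* Reader: copy the header into out; returns its length, or 0 if unpublished. */
static unsigned int model_hh_read(struct model_hh_cache *hh, unsigned char *out)
{
	unsigned int len = atomic_load_explicit(&hh->len, memory_order_acquire);

	if (len)
		memcpy(out, hh->data, len);
	return len;
}
```

A reader that observes a non-zero length is guaranteed to see the header bytes written before it, which is the property the lockless fast path relies on.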
      
      [1]
      
      BUG: KCSAN: data-race in eth_header_cache / neigh_resolve_output
      
      write to 0xffff8880b9dedcb8 of 4 bytes by task 29760 on cpu 0:
       eth_header_cache+0xa9/0xd0 net/ethernet/eth.c:247
       neigh_hh_init net/core/neighbour.c:1463 [inline]
       neigh_resolve_output net/core/neighbour.c:1480 [inline]
       neigh_resolve_output+0x415/0x470 net/core/neighbour.c:1470
       neigh_output include/net/neighbour.h:511 [inline]
       ip6_finish_output2+0x7a2/0xec0 net/ipv6/ip6_output.c:116
       __ip6_finish_output net/ipv6/ip6_output.c:142 [inline]
       __ip6_finish_output+0x2d7/0x330 net/ipv6/ip6_output.c:127
       ip6_finish_output+0x41/0x160 net/ipv6/ip6_output.c:152
       NF_HOOK_COND include/linux/netfilter.h:294 [inline]
       ip6_output+0xf2/0x280 net/ipv6/ip6_output.c:175
       dst_output include/net/dst.h:436 [inline]
       NF_HOOK include/linux/netfilter.h:305 [inline]
       ndisc_send_skb+0x459/0x5f0 net/ipv6/ndisc.c:505
       ndisc_send_ns+0x207/0x430 net/ipv6/ndisc.c:647
       rt6_probe_deferred+0x98/0xf0 net/ipv6/route.c:615
       process_one_work+0x3d4/0x890 kernel/workqueue.c:2269
       worker_thread+0xa0/0x800 kernel/workqueue.c:2415
       kthread+0x1d4/0x200 drivers/block/aoe/aoecmd.c:1253
       ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352
      
      read to 0xffff8880b9dedcb8 of 4 bytes by task 29572 on cpu 1:
       neigh_resolve_output net/core/neighbour.c:1479 [inline]
       neigh_resolve_output+0x113/0x470 net/core/neighbour.c:1470
       neigh_output include/net/neighbour.h:511 [inline]
       ip6_finish_output2+0x7a2/0xec0 net/ipv6/ip6_output.c:116
       __ip6_finish_output net/ipv6/ip6_output.c:142 [inline]
       __ip6_finish_output+0x2d7/0x330 net/ipv6/ip6_output.c:127
       ip6_finish_output+0x41/0x160 net/ipv6/ip6_output.c:152
       NF_HOOK_COND include/linux/netfilter.h:294 [inline]
       ip6_output+0xf2/0x280 net/ipv6/ip6_output.c:175
       dst_output include/net/dst.h:436 [inline]
       NF_HOOK include/linux/netfilter.h:305 [inline]
       ndisc_send_skb+0x459/0x5f0 net/ipv6/ndisc.c:505
       ndisc_send_ns+0x207/0x430 net/ipv6/ndisc.c:647
       rt6_probe_deferred+0x98/0xf0 net/ipv6/route.c:615
       process_one_work+0x3d4/0x890 kernel/workqueue.c:2269
       worker_thread+0xa0/0x800 kernel/workqueue.c:2415
       kthread+0x1d4/0x200 drivers/block/aoe/aoecmd.c:1253
       ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 1 PID: 29572 Comm: kworker/1:4 Not tainted 5.4.0-rc6+ #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Workqueue: events rt6_probe_deferred

      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • xfs: periodically yield scrub threads to the scheduler · 58a46618
      Darrick J. Wong authored
      [ Upstream commit 5d1116d4 ]
      
      Christoph Hellwig complained about the following soft lockup warning
      when running scrub after generic/175 when preemption is disabled and
      slub debugging is enabled:
      
      watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [xfs_scrub:161]
      Modules linked in:
      irq event stamp: 41692326
      hardirqs last  enabled at (41692325): [<ffffffff8232c3b7>] _raw_0
      hardirqs last disabled at (41692326): [<ffffffff81001c5a>] trace0
      softirqs last  enabled at (41684994): [<ffffffff8260031f>] __do_e
      softirqs last disabled at (41684987): [<ffffffff81127d8c>] irq_e0
      CPU: 3 PID: 16189 Comm: xfs_scrub Not tainted 5.4.0-rc3+ #30
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.124
      RIP: 0010:_raw_spin_unlock_irqrestore+0x39/0x40
      Code: 89 f3 be 01 00 00 00 e8 d5 3a e5 fe 48 89 ef e8 ed 87 e5 f2
      RSP: 0018:ffffc9000233f970 EFLAGS: 00000286 ORIG_RAX: ffffffffff3
      RAX: ffff88813b398040 RBX: 0000000000000286 RCX: 0000000000000006
      RDX: 0000000000000006 RSI: ffff88813b3988c0 RDI: ffff88813b398040
      RBP: ffff888137958640 R08: 0000000000000001 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000000 R12: ffffea00042b0c00
      R13: 0000000000000001 R14: ffff88810ac32308 R15: ffff8881376fc040
      FS:  00007f6113dea700(0000) GS:ffff88813bb80000(0000) knlGS:00000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007f6113de8ff8 CR3: 000000012f290000 CR4: 00000000000006e0
      Call Trace:
       free_debug_processing+0x1dd/0x240
       __slab_free+0x231/0x410
       kmem_cache_free+0x30e/0x360
       xchk_ag_btcur_free+0x76/0xb0
       xchk_ag_free+0x10/0x80
       xchk_bmap_iextent_xref.isra.14+0xd9/0x120
       xchk_bmap_iextent+0x187/0x210
       xchk_bmap+0x2e0/0x3b0
       xfs_scrub_metadata+0x2e7/0x500
       xfs_ioc_scrub_metadata+0x4a/0xa0
       xfs_file_ioctl+0x58a/0xcd0
       do_vfs_ioctl+0xa0/0x6f0
       ksys_ioctl+0x5b/0x90
       __x64_sys_ioctl+0x11/0x20
       do_syscall_64+0x4b/0x1a0
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      If preemption is disabled, all metadata buffers needed to perform the
      scrub are already in memory, and there are a lot of records to check,
      it's possible that the scrub thread will run for an extended period of
      time without sleeping for IO or any other reason.  Then the watchdog
      timer or the RCU stall timeout can trigger, producing the backtrace
      above.
      
      To fix this problem, call cond_resched() from the scrub thread so that
      we back out to the scheduler whenever necessary.

      Reported-by: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ath9k_htc: Discard undersized packets · 6dc835db
      Masashi Honma authored
      [ Upstream commit cd486e62 ]
      
      Sometimes the hardware will push small packets that trigger a WARN_ON
      in mac80211. Discard them early to avoid this issue.
      
      This patch ports 2 patches from ath9k to ath9k_htc:
      commit 3c0efb74 "ath9k: discard undersized packets"
      commit df5c4150 "ath9k: correctly handle short radar pulses"
      
      [  112.835889] ------------[ cut here ]------------
      [  112.835971] WARNING: CPU: 5 PID: 0 at net/mac80211/rx.c:804 ieee80211_rx_napi+0xaac/0xb40 [mac80211]
      [  112.835973] Modules linked in: ath9k_htc ath9k_common ath9k_hw ath mac80211 cfg80211 libarc4 nouveau snd_hda_codec_hdmi intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_intel snd_hda_codec video snd_hda_core ttm snd_hwdep drm_kms_helper snd_pcm crct10dif_pclmul snd_seq_midi drm snd_seq_midi_event crc32_pclmul snd_rawmidi ghash_clmulni_intel snd_seq aesni_intel aes_x86_64 crypto_simd cryptd snd_seq_device glue_helper snd_timer sch_fq_codel i2c_algo_bit fb_sys_fops snd input_leds syscopyarea sysfillrect sysimgblt intel_cstate mei_me intel_rapl_perf soundcore mxm_wmi lpc_ich mei kvm_intel kvm mac_hid irqbypass parport_pc ppdev lp parport ip_tables x_tables autofs4 hid_generic usbhid hid raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear e1000e ahci libahci wmi
      [  112.836022] CPU: 5 PID: 0 Comm: swapper/5 Not tainted 5.3.0-wt #1
      [  112.836023] Hardware name: MouseComputer Co.,Ltd. X99-S01/X99-S01, BIOS 1.0C-W7 04/01/2015
      [  112.836056] RIP: 0010:ieee80211_rx_napi+0xaac/0xb40 [mac80211]
      [  112.836059] Code: 00 00 66 41 89 86 b0 00 00 00 e9 c8 fa ff ff 4c 89 b5 40 ff ff ff 49 89 c6 e9 c9 fa ff ff 48 c7 c7 e0 a2 a5 c0 e8 47 41 b0 e9 <0f> 0b 48 89 df e8 5a 94 2d ea e9 02 f9 ff ff 41 39 c1 44 89 85 60
      [  112.836060] RSP: 0018:ffffaa6180220da8 EFLAGS: 00010286
      [  112.836062] RAX: 0000000000000024 RBX: ffff909a20eeda00 RCX: 0000000000000000
      [  112.836064] RDX: 0000000000000000 RSI: ffff909a2f957448 RDI: ffff909a2f957448
      [  112.836065] RBP: ffffaa6180220e78 R08: 00000000000006e9 R09: 0000000000000004
      [  112.836066] R10: 000000000000000a R11: 0000000000000001 R12: 0000000000000000
      [  112.836068] R13: ffff909a261a47a0 R14: 0000000000000000 R15: 0000000000000004
      [  112.836070] FS:  0000000000000000(0000) GS:ffff909a2f940000(0000) knlGS:0000000000000000
      [  112.836071] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  112.836073] CR2: 00007f4e3ffffa08 CR3: 00000001afc0a006 CR4: 00000000001606e0
      [  112.836074] Call Trace:
      [  112.836076]  <IRQ>
      [  112.836083]  ? finish_td+0xb3/0xf0
      [  112.836092]  ? ath9k_rx_prepare.isra.11+0x22f/0x2a0 [ath9k_htc]
      [  112.836099]  ath9k_rx_tasklet+0x10b/0x1d0 [ath9k_htc]
      [  112.836105]  tasklet_action_common.isra.22+0x63/0x110
      [  112.836108]  tasklet_action+0x22/0x30
      [  112.836115]  __do_softirq+0xe4/0x2da
      [  112.836118]  irq_exit+0xae/0xb0
      [  112.836121]  do_IRQ+0x86/0xe0
      [  112.836125]  common_interrupt+0xf/0xf
      [  112.836126]  </IRQ>
      [  112.836130] RIP: 0010:cpuidle_enter_state+0xa9/0x440
      [  112.836133] Code: 3d bc 20 38 55 e8 f7 1d 84 ff 49 89 c7 0f 1f 44 00 00 31 ff e8 28 29 84 ff 80 7d d3 00 0f 85 e6 01 00 00 fb 66 0f 1f 44 00 00 <45> 85 ed 0f 89 ff 01 00 00 41 c7 44 24 10 00 00 00 00 48 83 c4 18
      [  112.836134] RSP: 0018:ffffaa61800e3e48 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffde
      [  112.836136] RAX: ffff909a2f96b340 RBX: ffffffffabb58200 RCX: 000000000000001f
      [  112.836137] RDX: 0000001a458adc5d RSI: 0000000026c9b581 RDI: 0000000000000000
      [  112.836139] RBP: ffffaa61800e3e88 R08: 0000000000000002 R09: 000000000002abc0
      [  112.836140] R10: ffffaa61800e3e18 R11: 000000000000002d R12: ffffca617fb40b00
      [  112.836141] R13: 0000000000000002 R14: ffffffffabb582d8 R15: 0000001a458adc5d
      [  112.836145]  ? cpuidle_enter_state+0x98/0x440
      [  112.836149]  ? menu_select+0x370/0x600
      [  112.836151]  cpuidle_enter+0x2e/0x40
      [  112.836154]  call_cpuidle+0x23/0x40
      [  112.836156]  do_idle+0x204/0x280
      [  112.836159]  cpu_startup_entry+0x1d/0x20
      [  112.836164]  start_secondary+0x167/0x1c0
      [  112.836169]  secondary_startup_64+0xa4/0xb0
      [  112.836173] ---[ end trace 9f4cd18479cc5ae5 ]---

      Signed-off-by: Masashi Honma <masashi.honma@gmail.com>
      Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ath9k_htc: Modify byte order for an error message · f10bcc6b
      Masashi Honma authored
      [ Upstream commit e01fddc1 ]
      
      rs_datalen is be16 so we need to convert it before printing.

      Signed-off-by: Masashi Honma <masashi.honma@gmail.com>
      Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net: core: limit nested device depth · a2e06554
      Taehee Yoo authored
      [ Upstream commit 5343da4c ]
      
      The current code does not limit the number of nested devices.
      Nested devices are handled recursively, which needs a large amount of
      stack memory, so unlimited nesting can overflow the stack.

      This patch adds upper_level and lower_level. They are common variables
      that represent the maximum upper/lower depth.
      When an upper/lower device is attached or detached,
      {lower/upper}_level are updated, and if the maximum depth is bigger
      than 8, the attach routine fails and returns -EMLINK.

      In addition, this patch converts the recursive
      netdev_walk_all_{lower/upper} routines to iterative ones.
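The depth check and the iterative walk can be sketched in userspace as follows. The model is deliberately simplified: each device has at most one lower device, the device being attached is assumed to be new (nothing above or below it), and the lower device is assumed to be the current top of its chain. model_dev, model_dev_init, model_link and MODEL_MAX_NEST_DEV are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MODEL_MAX_NEST_DEV 8 /* limit stated in the commit message */

struct model_dev {
	struct model_dev *lower;  /* single lower device in this model */
	unsigned int upper_level; /* depth of the stack above, counting itself */
	unsigned int lower_level; /* depth of the stack below, counting itself */
};

static void model_dev_init(struct model_dev *d)
{
	d->lower = NULL;
	d->upper_level = 1;
	d->lower_level = 1;
}

/* Stack a new device `upper` on top of `lower`. Returns 0 or -EMLINK. */
static int model_link(struct model_dev *lower, struct model_dev *upper)
{
	struct model_dev *d;
	unsigned int level;

	assert(lower != NULL && upper != NULL && upper->lower == NULL);

	/* Reject the link if the combined stack would be too deep. */
	if (lower->lower_level + upper->upper_level > MODEL_MAX_NEST_DEV)
		return -EMLINK;

	upper->lower = lower;
	upper->lower_level = lower->lower_level + 1;

	/* Walk down the chain with a loop (no recursion), refreshing upper_level. */
	level = upper->upper_level;
	for (d = lower; d != NULL; d = d->lower)
		d->upper_level = ++level;

	return 0;
}
```

In this model a chain of eight devices can be built, and attaching a ninth fails with -EMLINK before any state is changed.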
      
      Test commands:
          ip link add dummy0 type dummy
          ip link add link dummy0 name vlan1 type vlan id 1
          ip link set vlan1 up
      
          for i in {2..55}
          do
      	    let A=$i-1
      
      	    ip link add vlan$i link vlan$A type vlan id $i
          done
          ip link del dummy0
      
      Splat looks like:
      [  155.513226][  T908] BUG: KASAN: use-after-free in __unwind_start+0x71/0x850
      [  155.514162][  T908] Write of size 88 at addr ffff8880608a6cc0 by task ip/908
      [  155.515048][  T908]
      [  155.515333][  T908] CPU: 0 PID: 908 Comm: ip Not tainted 5.4.0-rc3+ #96
      [  155.516147][  T908] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
      [  155.517233][  T908] Call Trace:
      [  155.517627][  T908]
      [  155.517918][  T908] Allocated by task 0:
      [  155.518412][  T908] (stack is not available)
      [  155.518955][  T908]
      [  155.519228][  T908] Freed by task 0:
      [  155.519885][  T908] (stack is not available)
      [  155.520452][  T908]
      [  155.520729][  T908] The buggy address belongs to the object at ffff8880608a6ac0
      [  155.520729][  T908]  which belongs to the cache names_cache of size 4096
      [  155.522387][  T908] The buggy address is located 512 bytes inside of
      [  155.522387][  T908]  4096-byte region [ffff8880608a6ac0, ffff8880608a7ac0)
      [  155.523920][  T908] The buggy address belongs to the page:
      [  155.524552][  T908] page:ffffea0001822800 refcount:1 mapcount:0 mapping:ffff88806c657cc0 index:0x0 compound_mapcount:0
      [  155.525836][  T908] flags: 0x100000000010200(slab|head)
      [  155.526445][  T908] raw: 0100000000010200 ffffea0001813808 ffffea0001a26c08 ffff88806c657cc0
      [  155.527424][  T908] raw: 0000000000000000 0000000000070007 00000001ffffffff 0000000000000000
      [  155.528429][  T908] page dumped because: kasan: bad access detected
      [  155.529158][  T908]
      [  155.529410][  T908] Memory state around the buggy address:
      [  155.530060][  T908]  ffff8880608a6b80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [  155.530971][  T908]  ffff8880608a6c00: fb fb fb fb fb f1 f1 f1 f1 00 f2 f2 f2 f3 f3 f3
      [  155.531889][  T908] >ffff8880608a6c80: f3 fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [  155.532806][  T908]                                            ^
      [  155.533509][  T908]  ffff8880608a6d00: fb fb fb fb fb fb fb fb fb f1 f1 f1 f1 00 00 00
      [  155.534436][  T908]  ffff8880608a6d80: f2 f3 f3 f3 f3 fb fb fb 00 00 00 00 00 00 00 00
      [ ... ]

      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • tcp: annotate tp->rcv_nxt lockless reads · 67f028ac
      Eric Dumazet authored
      [ Upstream commit dba7d9b8 ]
      
      There are a few places where we fetch tp->rcv_nxt while
      this field can change from IRQ or another CPU.
      
      We need to add READ_ONCE() annotations, and also make
      sure write sides use corresponding WRITE_ONCE() to avoid
      store-tearing.
      
      Note that tcp_inq_hint() was already using READ_ONCE(tp->rcv_nxt)
      
      syzbot reported :
      
      BUG: KCSAN: data-race in tcp_poll / tcp_queue_rcv
      
      write to 0xffff888120425770 of 4 bytes by interrupt on cpu 0:
       tcp_rcv_nxt_update net/ipv4/tcp_input.c:3365 [inline]
       tcp_queue_rcv+0x180/0x380 net/ipv4/tcp_input.c:4638
       tcp_rcv_established+0xbf1/0xf50 net/ipv4/tcp_input.c:5616
       tcp_v4_do_rcv+0x381/0x4e0 net/ipv4/tcp_ipv4.c:1542
       tcp_v4_rcv+0x1a03/0x1bf0 net/ipv4/tcp_ipv4.c:1923
       ip_protocol_deliver_rcu+0x51/0x470 net/ipv4/ip_input.c:204
       ip_local_deliver_finish+0x110/0x140 net/ipv4/ip_input.c:231
       NF_HOOK include/linux/netfilter.h:305 [inline]
       NF_HOOK include/linux/netfilter.h:299 [inline]
       ip_local_deliver+0x133/0x210 net/ipv4/ip_input.c:252
       dst_input include/net/dst.h:442 [inline]
       ip_rcv_finish+0x121/0x160 net/ipv4/ip_input.c:413
       NF_HOOK include/linux/netfilter.h:305 [inline]
       NF_HOOK include/linux/netfilter.h:299 [inline]
       ip_rcv+0x18f/0x1a0 net/ipv4/ip_input.c:523
       __netif_receive_skb_one_core+0xa7/0xe0 net/core/dev.c:5004
       __netif_receive_skb+0x37/0xf0 net/core/dev.c:5118
       netif_receive_skb_internal+0x59/0x190 net/core/dev.c:5208
       napi_skb_finish net/core/dev.c:5671 [inline]
       napi_gro_receive+0x28f/0x330 net/core/dev.c:5704
       receive_buf+0x284/0x30b0 drivers/net/virtio_net.c:1061
      
      read to 0xffff888120425770 of 4 bytes by task 7254 on cpu 1:
       tcp_stream_is_readable net/ipv4/tcp.c:480 [inline]
       tcp_poll+0x204/0x6b0 net/ipv4/tcp.c:554
       sock_poll+0xed/0x250 net/socket.c:1256
       vfs_poll include/linux/poll.h:90 [inline]
       ep_item_poll.isra.0+0x90/0x190 fs/eventpoll.c:892
       ep_send_events_proc+0x113/0x5c0 fs/eventpoll.c:1749
       ep_scan_ready_list.constprop.0+0x189/0x500 fs/eventpoll.c:704
       ep_send_events fs/eventpoll.c:1793 [inline]
       ep_poll+0xe3/0x900 fs/eventpoll.c:1930
       do_epoll_wait+0x162/0x180 fs/eventpoll.c:2294
       __do_sys_epoll_pwait fs/eventpoll.c:2325 [inline]
       __se_sys_epoll_pwait fs/eventpoll.c:2311 [inline]
       __x64_sys_epoll_pwait+0xcd/0x170 fs/eventpoll.c:2311
       do_syscall_64+0xcf/0x2f0 arch/x86/entry/common.c:296
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 1 PID: 7254 Comm: syz-fuzzer Not tainted 5.3.0+ #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011

      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • rxrpc: Fix possible NULL pointer access in ICMP handling · d9f4d60a
      David Howells authored
      [ Upstream commit f0308fb0 ]
      
      If an ICMP packet comes in on the UDP socket backing an AF_RXRPC socket as
      the UDP socket is being shut down, rxrpc_error_report() may get called to
      deal with it after sk_user_data on the UDP socket has been cleared, leading
      to a NULL pointer access when this local endpoint record gets accessed.
      
      Fix this by just returning immediately if sk_user_data was NULL.
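The early-return pattern can be sketched in userspace like this. The types and the function are hypothetical stand-ins (model_sock, model_local, model_error_report); only the shape of the check reflects the description above.

```c
#include <assert.h>
#include <stddef.h>

struct model_local {
	int errors_seen;
};

struct model_sock {
	struct model_local *user_data; /* cleared (NULL) during shutdown */
};

/* Returns 1 if the error was handled, 0 if it was ignored. */
static int model_error_report(struct model_sock *sk)
{
	struct model_local *local;

	assert(sk != NULL);

	/* Read the pointer once and bail out if the endpoint is already gone. */
	local = sk->user_data;
	if (!local)
		return 0;

	local->errors_seen++;
	return 1;
}
```

A late error report on a socket that is being shut down is then dropped instead of dereferencing a NULL endpoint pointer.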
      
      The oops looks like the following:
      
      #PF: supervisor read access in kernel mode
      #PF: error_code(0x0000) - not-present page
      ...
      RIP: 0010:rxrpc_error_report+0x1bd/0x6a9
      ...
      Call Trace:
       ? sock_queue_err_skb+0xbd/0xde
       ? __udp4_lib_err+0x313/0x34d
       __udp4_lib_err+0x313/0x34d
       icmp_unreach+0x1ee/0x207
       icmp_rcv+0x25b/0x28f
       ip_protocol_deliver_rcu+0x95/0x10e
       ip_local_deliver+0xe9/0x148
       __netif_receive_skb_one_core+0x52/0x6e
       process_backlog+0xdc/0x177
       net_rx_action+0xf9/0x270
       __do_softirq+0x1b6/0x39a
       ? smpboot_register_percpu_thread+0xce/0xce
       run_ksoftirqd+0x1d/0x42
       smpboot_thread_fn+0x19e/0x1b3
       kthread+0xf1/0xf6
       ? kthread_delayed_work_timer_fn+0x83/0x83
       ret_from_fork+0x24/0x30
      
      Fixes: 17926a79 ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both")
      Reported-by: syzbot+611164843bd48cc2190c@syzkaller.appspotmail.com
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>