1. 21 Oct, 2018 3 commits
  2. 20 Oct, 2018 11 commits
  3. 19 Oct, 2018 13 commits
  4. 18 Oct, 2018 13 commits
    • Stefano Brivio's avatar
      ip6_tunnel: Fix encapsulation layout · d4d576f5
      Stefano Brivio authored
      Commit 058214a4 ("ip6_tun: Add infrastructure for doing
      encapsulation") added the ip6_tnl_encap() call in ip6_tnl_xmit(), before
      the call to ipv6_push_frag_opts() to append the IPv6 Tunnel Encapsulation
      Limit option (option 4, RFC 2473, par. 5.1) to the outer IPv6 header.
      
      As long as the option didn't actually end up in generated packets, this
      wasn't an issue. Then commit 89a23c8b ("ip6_tunnel: Fix missing tunnel
      encapsulation limit option") fixed sending of this option, and the
      resulting layout, e.g. for FoU, is:
      
      .-------------------.------------.----------.-------------------.----- - -
      | Outer IPv6 Header | UDP header | Option 4 | Inner IPv6 Header | Payload
      '-------------------'------------'----------'-------------------'----- - -
      
      Needless to say, FoU and GUE (at least) won't work over IPv6. The option
      is appended by default, and I couldn't find a way to disable it with the
      current iproute2.
      
      Turn this into a more reasonable:
      
      .-------------------.----------.------------.-------------------.----- - -
      | Outer IPv6 Header | Option 4 | UDP header | Inner IPv6 Header | Payload
      '-------------------'----------'------------'-------------------'----- - -
      
      With this, and with 84dad559 ("udp6: fix encap return code for
      resubmitting"), FoU and GUE work again over IPv6.
      
      Fixes: 058214a4 ("ip6_tun: Add infrastructure for doing encapsulation")
      Signed-off-by: default avatarStefano Brivio <sbrivio@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d4d576f5
    • Jon Maloy's avatar
      tipc: fix info leak from kernel tipc_event · b06f9d9f
      Jon Maloy authored
      We initialize a struct tipc_event allocated on the kernel stack to
      zero to avert info leak to user space.
      
      Reported-by: syzbot+057458894bc8cada4dee@syzkaller.appspotmail.com
      Signed-off-by: default avatarJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b06f9d9f
    • Wenwen Wang's avatar
      net: socket: fix a missing-check bug · b6168562
      Wenwen Wang authored
      In ethtool_ioctl(), the ioctl command 'ethcmd' is checked through a switch
      statement to see whether it is necessary to pre-process the ethtool
      structure, because, as mentioned in the comment, the structure
      ethtool_rxnfc is defined with padding. If yes, a user-space buffer 'rxnfc'
      is allocated through compat_alloc_user_space(). One thing to note here is
      that, if 'ethcmd' is ETHTOOL_GRXCLSRLALL, the size of the buffer 'rxnfc' is
      partially determined by 'rule_cnt', which is actually acquired from the
      user-space buffer 'compat_rxnfc', i.e., 'compat_rxnfc->rule_cnt', through
      get_user(). After 'rxnfc' is allocated, the data in the original user-space
      buffer 'compat_rxnfc' is then copied to 'rxnfc' through copy_in_user(),
      including the 'rule_cnt' field. However, after this copy, no check is
      re-enforced on 'rxnfc->rule_cnt'. So it is possible that a malicious user
      race to change the value in the 'compat_rxnfc->rule_cnt' between these two
      copies. Through this way, the attacker can bypass the previous check on
      'rule_cnt' and inject malicious data. This can cause undefined behavior of
      the kernel and introduce potential security risk.
      
      This patch avoids the above issue via copying the value acquired by
      get_user() to 'rxnfc->rule_cn', if 'ethcmd' is ETHTOOL_GRXCLSRLALL.
      Signed-off-by: default avatarWenwen Wang <wang6495@umn.edu>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b6168562
    • Phil Sutter's avatar
      net: sched: Fix for duplicate class dump · 3c53ed8f
      Phil Sutter authored
      When dumping classes by parent, kernel would return classes twice:
      
      | # tc qdisc add dev lo root prio
      | # tc class show dev lo
      | class prio 8001:1 parent 8001:
      | class prio 8001:2 parent 8001:
      | class prio 8001:3 parent 8001:
      | # tc class show dev lo parent 8001:
      | class prio 8001:1 parent 8001:
      | class prio 8001:2 parent 8001:
      | class prio 8001:3 parent 8001:
      | class prio 8001:1 parent 8001:
      | class prio 8001:2 parent 8001:
      | class prio 8001:3 parent 8001:
      
      This comes from qdisc_match_from_root() potentially returning the root
      qdisc itself if its handle matched. Though in that case, root's classes
      were already dumped a few lines above.
      
      Fixes: cb395b20 ("net: sched: optimize class dumps")
      Signed-off-by: default avatarPhil Sutter <phil@nwl.cc>
      Reviewed-by: default avatarJiri Pirko <jiri@mellanox.com>
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      3c53ed8f
    • Heiner Kallweit's avatar
      r8169: fix NAPI handling under high load · 6b839b6c
      Heiner Kallweit authored
      rtl_rx() and rtl_tx() are called only if the respective bits are set
      in the interrupt status register. Under high load NAPI may not be
      able to process all data (work_done == budget) and it will schedule
      subsequent calls to the poll callback.
      rtl_ack_events() however resets the bits in the interrupt status
      register, therefore subsequent calls to rtl8169_poll() won't call
      rtl_rx() and rtl_tx() - chip interrupts are still disabled.
      
      Fix this by calling rtl_rx() and rtl_tx() independent of the bits
      set in the interrupt status register. Both functions will detect
      if there's nothing to do for them.
      
      Fixes: da78dbff ("r8169: remove work from irq handler.")
      Signed-off-by: default avatarHeiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      6b839b6c
    • David S. Miller's avatar
      sparc: Revert unintended perf changes. · 27faeebd
      David S. Miller authored
      Some local debugging hacks accidently slipped into the VDSO commit.
      
      Sorry!
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      27faeebd
    • Leo Li's avatar
      drm: Get ref on CRTC commit object when waiting for flip_done · 4364bcb2
      Leo Li authored
      This fixes a general protection fault, caused by accessing the contents
      of a flip_done completion object that has already been freed. It occurs
      due to the preemption of a non-blocking commit worker thread W by
      another commit thread X. X continues to clear its atomic state at the
      end, destroying the CRTC commit object that W still needs. Switching
      back to W and accessing the commit objects then leads to bad results.
      
      Worker W becomes preemptable when waiting for flip_done to complete. At
      this point, a frequently occurring commit thread X can take over. Here's
      an example where W is a worker thread that flips on both CRTCs, and X
      does a legacy cursor update on both CRTCs:
      
              ...
           1. W does flip work
           2. W runs commit_hw_done()
           3. W waits for flip_done on CRTC 1
           4. > flip_done for CRTC 1 completes
           5. W finishes waiting for CRTC 1
           6. W waits for flip_done on CRTC 2
      
           7. > Preempted by X
           8. > flip_done for CRTC 2 completes
           9. X atomic_check: hw_done and flip_done are complete on all CRTCs
          10. X updates cursor on both CRTCs
          11. X destroys atomic state
          12. X done
      
          13. > Switch back to W
          14. W waits for flip_done on CRTC 2
          15. W raises general protection fault
      
      The error looks like so:
      
          general protection fault: 0000 [#1] PREEMPT SMP PTI
          **snip**
          Call Trace:
           lock_acquire+0xa2/0x1b0
           _raw_spin_lock_irq+0x39/0x70
           wait_for_completion_timeout+0x31/0x130
           drm_atomic_helper_wait_for_flip_done+0x64/0x90 [drm_kms_helper]
           amdgpu_dm_atomic_commit_tail+0xcae/0xdd0 [amdgpu]
           commit_tail+0x3d/0x70 [drm_kms_helper]
           process_one_work+0x212/0x650
           worker_thread+0x49/0x420
           kthread+0xfb/0x130
           ret_from_fork+0x3a/0x50
          Modules linked in: x86_pkg_temp_thermal amdgpu(O) chash(O)
          gpu_sched(O) drm_kms_helper(O) syscopyarea sysfillrect sysimgblt
          fb_sys_fops ttm(O) drm(O)
      
      Note that i915 has this issue masked, since hw_done is signaled after
      waiting for flip_done. Doing so will block the cursor update from
      happening until hw_done is signaled, preventing the cursor commit from
      destroying the state.
      
      v2: The reference on the commit object needs to be obtained before
          hw_done() is signaled, since that's the point where another commit
          is allowed to modify the state. Assuming that the
          new_crtc_state->commit object still exists within flip_done() is
          incorrect.
      
          Fix by getting a reference in setup_commit(), and releasing it
          during default_clear().
      Signed-off-by: default avatarLeo Li <sunpeng.li@amd.com>
      Reviewed-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: default avatarHarry Wentland <harry.wentland@amd.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/1539611200-6184-1-git-send-email-sunpeng.li@amd.com
      4364bcb2
    • David S. Miller's avatar
      Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec · 2ee653f6
      David S. Miller authored
      Steffen Klassert says:
      
      ====================
      pull request (net): ipsec 2018-10-18
      
      1) Free the xfrm interface gro_cells when deleting the
         interface, otherwise we leak it. From Li RongQing.
      
      2) net/core/flow.c does not exist anymore, so remove it
         from the MAINTAINERS file.
      
      3) Fix a slab-out-of-bounds in _decode_session6.
         From Alexei Starovoitov.
      
      4) Fix RCU protection when policies inserted into
         thei bydst lists. From Florian Westphal.
      
      Please pull or let me know if there are problems.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      2ee653f6
    • Ming Lei's avatar
      block: don't deal with discard limit in blkdev_issue_discard() · 744889b7
      Ming Lei authored
      blk_queue_split() does respect this limit via bio splitting, so no
      need to do that in blkdev_issue_discard(), then we can align to
      normal bio submit(bio_add_page() & submit_bio()).
      
      More importantly, this patch fixes one issue introduced in a22c4d7e
      ("block: re-add discard_granularity and alignment checks"), in which
      zero discard bio may be generated in case of zero alignment.
      
      Fixes: a22c4d7e ("block: re-add discard_granularity and alignment checks")
      Cc: stable@vger.kernel.org
      Cc: Ming Lin <ming.l@ssi.samsung.com>
      Cc: Mike Snitzer <snitzer@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Xiao Ni <xni@redhat.com>
      Tested-by: default avatarMariusz Dabrowski <mariusz.dabrowski@intel.com>
      Signed-off-by: default avatarMing Lei <ming.lei@redhat.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      744889b7
    • Eric Sandeen's avatar
      fscache: Fix out of bound read in long cookie keys · fa520c47
      Eric Sandeen authored
      fscache_set_key() can incur an out-of-bounds read, reported by KASAN:
      
       BUG: KASAN: slab-out-of-bounds in fscache_alloc_cookie+0x5b3/0x680 [fscache]
       Read of size 4 at addr ffff88084ff056d4 by task mount.nfs/32615
      
      and also reported by syzbot at https://lkml.org/lkml/2018/7/8/236
      
        BUG: KASAN: slab-out-of-bounds in fscache_set_key fs/fscache/cookie.c:120 [inline]
        BUG: KASAN: slab-out-of-bounds in fscache_alloc_cookie+0x7a9/0x880 fs/fscache/cookie.c:171
        Read of size 4 at addr ffff8801d3cc8bb4 by task syz-executor907/4466
      
      This happens for any index_key_len which is not divisible by 4 and is
      larger than the size of the inline key, because the code allocates exactly
      index_key_len for the key buffer, but the hashing loop is stepping through
      it 4 bytes (u32) at a time in the buf[] array.
      
      Fix this by calculating how many u32 buffers we'll need by using
      DIV_ROUND_UP, and then using kcalloc() to allocate a precleared allocation
      buffer to hold the index_key, then using that same count as the hashing
      index limit.
      
      Fixes: ec0328e4 ("fscache: Maintain a catalogue of allocated cookies")
      Reported-by: syzbot+a95b989b2dde8e806af8@syzkaller.appspotmail.com
      Signed-off-by: default avatarEric Sandeen <sandeen@redhat.com>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fa520c47
    • David Howells's avatar
      fscache: Fix incomplete initialisation of inline key space · 1ff22883
      David Howells authored
      The inline key in struct rxrpc_cookie is insufficiently initialized,
      zeroing only 3 of the 4 slots, therefore an index_key_len between 13 and 15
      bytes will end up hashing uninitialized memory because the memcpy only
      partially fills the last buf[] element.
      
      Fix this by clearing fscache_cookie objects on allocation rather than using
      the slab constructor to initialise them.  We're going to pretty much fill
      in the entire struct anyway, so bringing it into our dcache writably
      shouldn't incur much overhead.
      
      This removes the need to do clearance in fscache_set_key() (where we aren't
      doing it correctly anyway).
      
      Also, we don't need to set cookie->key_len in fscache_set_key() as we
      already did it in the only caller, so remove that.
      
      Fixes: ec0328e4 ("fscache: Maintain a catalogue of allocated cookies")
      Reported-by: syzbot+a95b989b2dde8e806af8@syzkaller.appspotmail.com
      Reported-by: default avatarEric Sandeen <sandeen@redhat.com>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1ff22883
    • Al Viro's avatar
      cachefiles: fix the race between cachefiles_bury_object() and rmdir(2) · 169b8033
      Al Viro authored
      the victim might've been rmdir'ed just before the lock_rename();
      unlike the normal callers, we do not look the source up after the
      parents are locked - we know it beforehand and just recheck that it's
      still the child of what used to be its parent.  Unfortunately,
      the check is too weak - we don't spot a dead directory since its
      ->d_parent is unchanged, dentry is positive, etc.  So we sail all
      the way to ->rename(), with hosting filesystems _not_ expecting
      to be asked renaming an rmdir'ed subdirectory.
      
      The fix is easy, fortunately - the lock on parent is sufficient for
      making IS_DEADDIR() on child safe.
      
      Cc: stable@vger.kernel.org
      Fixes: 9ae326a6 (CacheFiles: A cache that backs onto a mounted filesystem)
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      169b8033
    • Linus Torvalds's avatar
      mremap: properly flush TLB before releasing the page · eb66ae03
      Linus Torvalds authored
      Jann Horn points out that our TLB flushing was subtly wrong for the
      mremap() case.  What makes mremap() special is that we don't follow the
      usual "add page to list of pages to be freed, then flush tlb, and then
      free pages".  No, mremap() obviously just _moves_ the page from one page
      table location to another.
      
      That matters, because mremap() thus doesn't directly control the
      lifetime of the moved page with a freelist: instead, the lifetime of the
      page is controlled by the page table locking, that serializes access to
      the entry.
      
      As a result, we need to flush the TLB not just before releasing the lock
      for the source location (to avoid any concurrent accesses to the entry),
      but also before we release the destination page table lock (to avoid the
      TLB being flushed after somebody else has already done something to that
      page).
      
      This also makes the whole "need_flush" logic unnecessary, since we now
      always end up flushing the TLB for every valid entry.
      Reported-and-tested-by: default avatarJann Horn <jannh@google.com>
      Acked-by: default avatarWill Deacon <will.deacon@arm.com>
      Tested-by: default avatarIngo Molnar <mingo@kernel.org>
      Acked-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      eb66ae03