1. 29 Apr, 2020 37 commits
  2. 23 Apr, 2020 3 commits
    • Linux 4.19.118 · 7edd66cf
      Greg Kroah-Hartman authored
    • bpf: fix buggy r0 retval refinement for tracing helpers · e0b80b7d
      Daniel Borkmann authored
      [ no upstream commit ]
      
      See the gory details in 10060503 ("bpf: Verifier, do_refine_retval_range
      may clamp umin to 0 incorrectly") for why 849fa506 ("bpf/verifier: refine
      retval R0 state for bpf_get_stack helper") is buggy. The whole series however
      is not suitable for stable since it adds a significant amount [0] of verifier
      complexity in order to add 32bit subreg tracking. Something simpler is needed.
      
      Unfortunately, reverting 849fa506 ("bpf/verifier: refine retval R0 state
      for bpf_get_stack helper") or just cherry-picking 10060503 ("bpf: Verifier,
      do_refine_retval_range may clamp umin to 0 incorrectly") is not an option since
      it will break existing tracing programs badly (at least those that are using
      bpf_get_stack() and bpf_probe_read_str() helpers). Not fixing it in stable is
      also not an option since on 4.19 kernels an error will cause a soft-lockup due
      to hitting a dead-code sanitized branch, since we don't hard-wire such branches
      in old kernels yet. But even then, for 5.x, 849fa506 ("bpf/verifier: refine
      retval R0 state for bpf_get_stack helper") would cause wrong bounds in the
      verifier simulation when an error is hit.
      
      In one of the earlier iterations of the mentioned patch series for upstream there
      was the concern that just using smax_value in do_refine_retval_range() would
      nuke bounds by subsequent <<32 >>32 shifts before the comparison against 0 [1]
      which eventually led to the 32bit subreg tracking in the first place. While I
      initially went for implementing the idea [1] to pattern match the two shift
      operations, it turned out to be more complex than actually needed, meaning, we
      could simply treat do_refine_retval_range() similarly to how we branch off
      verification for conditionals or under speculation, that is, pushing a new
      reg state to the stack for later verification. This means, instead of verifying
      the current path with the ret_reg in [S32MIN, msize_max_value] interval where
      later bounds would get nuked, we split this into two: i) for the success case
      where ret_reg can be in [0, msize_max_value], and ii) for the error case with
      ret_reg known to be in interval [S32MIN, -1]. The latter will preserve the
      bounds during these shift patterns and can match the reg < 0 test. test_progs
      also succeeds with this approach.
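
The path split described above can be illustrated with a toy interval model. This is a minimal sketch and not verifier code: `struct toy_range`, `enum toy_verdict` and all `toy_*` functions are hypothetical names, only signed bounds are modeled, and the shift handling is left out entirely. It only shows why a `reg < 0` test cannot be decided from the combined [S32MIN, msize_max_value] interval but can be decided on each of the two split intervals.

```c
#include <assert.h>
#include <stdint.h>

/* Toy signed interval; NOT the kernel's struct bpf_reg_state. */
struct toy_range {
	int64_t smin;
	int64_t smax;
};

enum toy_verdict { TOY_ALWAYS_FALSE, TOY_ALWAYS_TRUE, TOY_UNKNOWN };

/* Can "reg < 0" be decided from the signed bounds alone? */
enum toy_verdict toy_is_negative(struct toy_range r)
{
	if (r.smax < 0)
		return TOY_ALWAYS_TRUE;
	if (r.smin >= 0)
		return TOY_ALWAYS_FALSE;
	return TOY_UNKNOWN;
}

/* Single path: ret_reg in [S32MIN, msize_max_value]. */
struct toy_range toy_unsplit(int64_t msize_max_value)
{
	assert(msize_max_value >= 0);
	return (struct toy_range){ .smin = INT32_MIN, .smax = msize_max_value };
}

/* Path i): success case, ret_reg in [0, msize_max_value]. */
struct toy_range toy_success_path(int64_t msize_max_value)
{
	assert(msize_max_value >= 0);
	return (struct toy_range){ .smin = 0, .smax = msize_max_value };
}

/* Path ii): error case, ret_reg in [S32MIN, -1]. */
struct toy_range toy_error_path(void)
{
	return (struct toy_range){ .smin = INT32_MIN, .smax = -1 };
}
```

In this model the unsplit range yields `TOY_UNKNOWN` for the sign test, while each split path yields a definite answer, which is the property the two-path approach relies on.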
      
        [0] https://lore.kernel.org/bpf/158507130343.15666.8018068546764556975.stgit@john-Precision-5820-Tower/
        [1] https://lore.kernel.org/bpf/158015334199.28573.4940395881683556537.stgit@john-XPS-13-9370/T/#m2e0ad1d5949131014748b6daa48a3495e7f0456d
      
      Fixes: 849fa506 ("bpf/verifier: refine retval R0 state for bpf_get_stack helper")
      Reported-by: Lorenzo Fontana <fontanalorenz@gmail.com>
      Reported-by: Leonardo Di Donato <leodidonato@gmail.com>
      Reported-by: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: John Fastabend <john.fastabend@gmail.com>
      Tested-by: John Fastabend <john.fastabend@gmail.com>
      Tested-by: Lorenzo Fontana <fontanalorenz@gmail.com>
      Tested-by: Leonardo Di Donato <leodidonato@gmail.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • KEYS: Don't write out to userspace while holding key semaphore · 18779eac
      Waiman Long authored
      commit d3ec10aa upstream.
      
      A lockdep circular locking dependency report was seen when running a
      keyutils test:
      
      [12537.027242] ======================================================
      [12537.059309] WARNING: possible circular locking dependency detected
      [12537.088148] 4.18.0-147.7.1.el8_1.x86_64+debug #1 Tainted: G OE    --------- -  -
      [12537.125253] ------------------------------------------------------
      [12537.153189] keyctl/25598 is trying to acquire lock:
      [12537.175087] 000000007c39f96c (&mm->mmap_sem){++++}, at: __might_fault+0xc4/0x1b0
      [12537.208365]
      [12537.208365] but task is already holding lock:
      [12537.234507] 000000003de5b58d (&type->lock_class){++++}, at: keyctl_read_key+0x15a/0x220
      [12537.270476]
      [12537.270476] which lock already depends on the new lock.
      [12537.270476]
      [12537.307209]
      [12537.307209] the existing dependency chain (in reverse order) is:
      [12537.340754]
      [12537.340754] -> #3 (&type->lock_class){++++}:
      [12537.367434]        down_write+0x4d/0x110
      [12537.385202]        __key_link_begin+0x87/0x280
      [12537.405232]        request_key_and_link+0x483/0xf70
      [12537.427221]        request_key+0x3c/0x80
      [12537.444839]        dns_query+0x1db/0x5a5 [dns_resolver]
      [12537.468445]        dns_resolve_server_name_to_ip+0x1e1/0x4d0 [cifs]
      [12537.496731]        cifs_reconnect+0xe04/0x2500 [cifs]
      [12537.519418]        cifs_readv_from_socket+0x461/0x690 [cifs]
      [12537.546263]        cifs_read_from_socket+0xa0/0xe0 [cifs]
      [12537.573551]        cifs_demultiplex_thread+0x311/0x2db0 [cifs]
      [12537.601045]        kthread+0x30c/0x3d0
      [12537.617906]        ret_from_fork+0x3a/0x50
      [12537.636225]
      [12537.636225] -> #2 (root_key_user.cons_lock){+.+.}:
      [12537.664525]        __mutex_lock+0x105/0x11f0
      [12537.683734]        request_key_and_link+0x35a/0xf70
      [12537.705640]        request_key+0x3c/0x80
      [12537.723304]        dns_query+0x1db/0x5a5 [dns_resolver]
      [12537.746773]        dns_resolve_server_name_to_ip+0x1e1/0x4d0 [cifs]
      [12537.775607]        cifs_reconnect+0xe04/0x2500 [cifs]
      [12537.798322]        cifs_readv_from_socket+0x461/0x690 [cifs]
      [12537.823369]        cifs_read_from_socket+0xa0/0xe0 [cifs]
      [12537.847262]        cifs_demultiplex_thread+0x311/0x2db0 [cifs]
      [12537.873477]        kthread+0x30c/0x3d0
      [12537.890281]        ret_from_fork+0x3a/0x50
      [12537.908649]
      [12537.908649] -> #1 (&tcp_ses->srv_mutex){+.+.}:
      [12537.935225]        __mutex_lock+0x105/0x11f0
      [12537.954450]        cifs_call_async+0x102/0x7f0 [cifs]
      [12537.977250]        smb2_async_readv+0x6c3/0xc90 [cifs]
      [12538.000659]        cifs_readpages+0x120a/0x1e50 [cifs]
      [12538.023920]        read_pages+0xf5/0x560
      [12538.041583]        __do_page_cache_readahead+0x41d/0x4b0
      [12538.067047]        ondemand_readahead+0x44c/0xc10
      [12538.092069]        filemap_fault+0xec1/0x1830
      [12538.111637]        __do_fault+0x82/0x260
      [12538.129216]        do_fault+0x419/0xfb0
      [12538.146390]        __handle_mm_fault+0x862/0xdf0
      [12538.167408]        handle_mm_fault+0x154/0x550
      [12538.187401]        __do_page_fault+0x42f/0xa60
      [12538.207395]        do_page_fault+0x38/0x5e0
      [12538.225777]        page_fault+0x1e/0x30
      [12538.243010]
      [12538.243010] -> #0 (&mm->mmap_sem){++++}:
      [12538.267875]        lock_acquire+0x14c/0x420
      [12538.286848]        __might_fault+0x119/0x1b0
      [12538.306006]        keyring_read_iterator+0x7e/0x170
      [12538.327936]        assoc_array_subtree_iterate+0x97/0x280
      [12538.352154]        keyring_read+0xe9/0x110
      [12538.370558]        keyctl_read_key+0x1b9/0x220
      [12538.391470]        do_syscall_64+0xa5/0x4b0
      [12538.410511]        entry_SYSCALL_64_after_hwframe+0x6a/0xdf
      [12538.435535]
      [12538.435535] other info that might help us debug this:
      [12538.435535]
      [12538.472829] Chain exists of:
      [12538.472829]   &mm->mmap_sem --> root_key_user.cons_lock --> &type->lock_class
      [12538.472829]
      [12538.524820]  Possible unsafe locking scenario:
      [12538.524820]
      [12538.551431]        CPU0                    CPU1
      [12538.572654]        ----                    ----
      [12538.595865]   lock(&type->lock_class);
      [12538.613737]                                lock(root_key_user.cons_lock);
      [12538.644234]                                lock(&type->lock_class);
      [12538.672410]   lock(&mm->mmap_sem);
      [12538.687758]
      [12538.687758]  *** DEADLOCK ***
      [12538.687758]
      [12538.714455] 1 lock held by keyctl/25598:
      [12538.732097]  #0: 000000003de5b58d (&type->lock_class){++++}, at: keyctl_read_key+0x15a/0x220
      [12538.770573]
      [12538.770573] stack backtrace:
      [12538.790136] CPU: 2 PID: 25598 Comm: keyctl Kdump: loaded Tainted: G
      [12538.844855] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 12/27/2015
      [12538.881963] Call Trace:
      [12538.892897]  dump_stack+0x9a/0xf0
      [12538.907908]  print_circular_bug.isra.25.cold.50+0x1bc/0x279
      [12538.932891]  ? save_trace+0xd6/0x250
      [12538.948979]  check_prev_add.constprop.32+0xc36/0x14f0
      [12538.971643]  ? keyring_compare_object+0x104/0x190
      [12538.992738]  ? check_usage+0x550/0x550
      [12539.009845]  ? sched_clock+0x5/0x10
      [12539.025484]  ? sched_clock_cpu+0x18/0x1e0
      [12539.043555]  __lock_acquire+0x1f12/0x38d0
      [12539.061551]  ? trace_hardirqs_on+0x10/0x10
      [12539.080554]  lock_acquire+0x14c/0x420
      [12539.100330]  ? __might_fault+0xc4/0x1b0
      [12539.119079]  __might_fault+0x119/0x1b0
      [12539.135869]  ? __might_fault+0xc4/0x1b0
      [12539.153234]  keyring_read_iterator+0x7e/0x170
      [12539.172787]  ? keyring_read+0x110/0x110
      [12539.190059]  assoc_array_subtree_iterate+0x97/0x280
      [12539.211526]  keyring_read+0xe9/0x110
      [12539.227561]  ? keyring_gc_check_iterator+0xc0/0xc0
      [12539.249076]  keyctl_read_key+0x1b9/0x220
      [12539.266660]  do_syscall_64+0xa5/0x4b0
      [12539.283091]  entry_SYSCALL_64_after_hwframe+0x6a/0xdf
      
      One way to prevent this deadlock scenario from happening is to not
      allow writing to userspace while holding the key semaphore. Instead,
      an internal buffer is allocated for getting the keys out from the
      read method first before copying them out to userspace without holding
      the lock.
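
The "fill an internal buffer under the lock, copy out after dropping it" pattern can be sketched in userspace C. This is an analogy under stated assumptions, not the kernel change itself: `struct toy_key` and `toy_read_key()` are hypothetical, a pthread mutex stands in for the key semaphore, and the final `memcpy()` stands in for the step that would be `copy_to_user()` in the kernel.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a key; the mutex plays the role of key->sem. */
struct toy_key {
	pthread_mutex_t lock;
	const char *payload;
	size_t len;
};

/*
 * Copy up to dst_len payload bytes into dst (standing in for the
 * userspace buffer). Returns the full payload length, or -1 if the
 * internal buffer cannot be allocated.
 */
long toy_read_key(struct toy_key *key, char *dst, size_t dst_len)
{
	char *tmp;
	size_t n, total;

	assert(key && dst);

	/* Allocate the internal buffer before taking the lock. */
	tmp = malloc(dst_len ? dst_len : 1);
	if (!tmp)
		return -1;

	pthread_mutex_lock(&key->lock);
	total = key->len;
	n = total < dst_len ? total : dst_len;
	memcpy(tmp, key->payload, n);	/* touches no user memory */
	pthread_mutex_unlock(&key->lock);

	/* Lock dropped: only now is the destination buffer written. */
	memcpy(dst, tmp, n);
	free(tmp);
	return (long)total;
}
```

Because the destination is only written after the unlock, anything the destination write might need (in the kernel case, a page fault that takes mmap_sem) no longer nests inside the key lock.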
      
      That requires taking out the __user modifier from all the relevant
      read methods as well as additional changes to not use any userspace
      write helpers. That is,
      
        1) The put_user() call is replaced by a direct copy.
        2) The copy_to_user() call is replaced by memcpy().
        3) All the fault handling code is removed.
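
As a rough illustration of the three changes listed above, here is a minimal userspace sketch of a read method that takes a plain buffer. The names `struct toy_keyring` and `toy_keyring_read()` are hypothetical and do not reproduce any real key type's read method; the point is only that, with a plain buffer, each store is a direct copy and there is no fault path to handle.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical keyring-like object holding key serial numbers. */
struct toy_keyring {
	const int32_t *serials;
	size_t nr;
};

/*
 * Read method operating on a plain (non-__user) buffer.
 * Copies as many whole serials as fit into buffer and returns the
 * number of bytes needed to hold all of them. A NULL buffer only
 * queries the required size.
 */
long toy_keyring_read(const struct toy_keyring *kr, char *buffer, size_t buflen)
{
	size_t i, off = 0;

	assert(kr);

	if (buffer) {
		for (i = 0; i < kr->nr && off + sizeof(int32_t) <= buflen; i++) {
			/* Was a put_user()-style store; now a direct copy. */
			memcpy(buffer + off, &kr->serials[i], sizeof(int32_t));
			off += sizeof(int32_t);
		}
	}

	/* No -EFAULT style error path: the copy cannot fault. */
	return (long)(kr->nr * sizeof(int32_t));
}
```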
      
      Compiling on an x86-64 system, the size of the rxrpc_read() function is
      reduced from 3795 bytes to 2384 bytes with this patch.
      
      Fixes: ^1da177e4 ("Linux-2.6.12-rc2")
      Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
      Signed-off-by: Waiman Long <longman@redhat.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>