1. 22 Oct, 2015 33 commits
  2. 18 Sep, 2015 7 commits
    • Linux 3.4.109 · 4a55c0cf
      Zefan Li authored
      4a55c0cf
    • udp: fix behavior of wrong checksums · 1c50a0ae
      Eric Dumazet authored
      commit beb39db5 upstream.
      
      We have two problems in the UDP stack related to bogus checksums:
      
      1) We return -EAGAIN to the application even if the receive queue is not
         empty. This breaks applications using edge-triggered epoll().
      
      2) Under a UDP flood, we can loop forever without yielding to other
         processes, potentially hanging the host, especially on non-SMP systems.
      
      This patch is an attempt to make things better.
      
      We might in the future add extra support for RT applications that want
      better control over the time spent doing a recv() in a hostile
      environment. For example, we could validate checksums before queuing
      packets in the socket receive queue.
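
      For context, a minimal user-space sketch (not part of the patch) of the
      edge-triggered epoll() pattern that problem 1) breaks: the reader treats
      EAGAIN as "queue drained", so a spurious EAGAIN with datagrams still
      queued leaves them unread until new traffic arrives. The port number is
      arbitrary.

       #include <arpa/inet.h>
       #include <errno.h>
       #include <fcntl.h>
       #include <netinet/in.h>
       #include <stdio.h>
       #include <sys/epoll.h>
       #include <sys/socket.h>
       #include <unistd.h>

       int main(void)
       {
           /* Non-blocking UDP socket bound to an arbitrary test port. */
           int fd = socket(AF_INET, SOCK_DGRAM, 0);
           struct sockaddr_in addr = { .sin_family = AF_INET,
                                       .sin_port = htons(9999),
                                       .sin_addr.s_addr = htonl(INADDR_ANY) };
           if (fd < 0 || bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
               perror("socket/bind");
               return 1;
           }
           fcntl(fd, F_SETFL, O_NONBLOCK);

           int epfd = epoll_create1(0);
           struct epoll_event ev = { .events = EPOLLIN | EPOLLET, .data.fd = fd };
           epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);

           char buf[2048];
           for (;;) {
               struct epoll_event out;
               if (epoll_wait(epfd, &out, 1, -1) <= 0)
                   continue;
               /* Edge-triggered: drain until the kernel reports an empty queue. */
               for (;;) {
                   ssize_t len = recv(fd, buf, sizeof(buf), 0);
                   if (len < 0) {
                       if (errno == EAGAIN || errno == EWOULDBLOCK)
                           break;  /* assumed: nothing left to read */
                       perror("recv");
                       break;
                   }
                   printf("datagram: %zd bytes\n", len);
               }
           }
       }
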
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Zefan Li <lizefan@huawei.com>
      1c50a0ae
    • sched: Queue RT tasks to head when prio drops · aaedb090
      Thomas Gleixner authored
      commit 81a44c54 upstream.
      
      The following scenario does not work correctly:
      
      Runqueue of CPUx contains two runnable and pinned tasks:
      
       T1: SCHED_FIFO, prio 80
       T2: SCHED_FIFO, prio 80
      
      T1 is on the cpu and executes the following syscalls (classic priority
      ceiling scenario):
      
       sys_sched_setscheduler(pid(T1), SCHED_FIFO, .prio = 90);
       ...
       sys_sched_setscheduler(pid(T1), SCHED_FIFO, .prio = 80);
       ...
      
      Now T1 gets preempted by T3 (SCHED_FIFO, prio 95). After T3 goes back
      to sleep the scheduler picks T2. Surprise!
      
      The same happens w/o actual preemption when T1 is forced into the
      scheduler due to a sporadic NEED_RESCHED event. The scheduler invokes
      pick_next_task() which returns T2. So T1 gets preempted and scheduled
      out.
      
      This happens because sched_setscheduler() dequeues T1 from the prio 90
      list and then enqueues it on the tail of the prio 80 list behind T2.
      This violates the POSIX spec and surprises user space which relies on
      the guarantee that SCHED_FIFO tasks are not scheduled out unless they
      give the CPU up voluntarily or are preempted by a higher priority
      task. In the latter case the preempted task must get back on the CPU
      after the preempting task schedules out again.
      
      We fixed a similar issue already in commit 60db48ca (sched: Queue a
      deboosted task to the head of the RT prio queue). The same treatment
      is necessary for sched_setscheduler(). So enqueue to the head of the prio
      bucket list if the priority of the task is lowered.
      
      It might be possible that existing user space relies on the current
      behaviour, but it can be considered highly unlikely due to the corner
      case nature of the application scenario.
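
      For illustration, a minimal user-space sketch of the priority-ceiling
      sequence described above (run with the privileges needed to set
      real-time priorities; the helper name is ours, not from the patch):

       #include <sched.h>
       #include <stdio.h>

       static int set_fifo_prio(int prio)
       {
           struct sched_param param = { .sched_priority = prio };

           /* pid 0 means the calling task, i.e. T1 in the scenario above. */
           return sched_setscheduler(0, SCHED_FIFO, &param);
       }

       int main(void)
       {
           /* Boost to the ceiling before the critical section... */
           if (set_fifo_prio(90) < 0)
               perror("sched_setscheduler(90)");

           /* ...critical section... */

           /* ...then drop back to the base priority.  Per POSIX, the task
            * should stay on the CPU instead of being queued behind another
            * prio 80 task, which is the behaviour this patch restores. */
           if (set_fifo_prio(80) < 0)
               perror("sched_setscheduler(80)");

           return 0;
       }
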
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1391803122-4425-6-git-send-email-bigeasy@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Zefan Li <lizefan@huawei.com>
      aaedb090
    • pipe: iovec: Fix memory corruption when retrying atomic copy as non-atomic · a39bf4a8
      Ben Hutchings authored
      pipe_iov_copy_{from,to}_user() may be tried twice with the same iovec,
      the first time atomically and the second time not.  The second attempt
      needs to continue from the iovec position, pipe buffer offset and
      remaining length where the first attempt failed, but currently the
      pipe buffer offset and remaining length are reset.  This will corrupt
      the piped data (possibly also leading to an information leak between
      processes) and may also corrupt kernel memory.
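
      To make the failure mode concrete, here is a simplified, hypothetical
      sketch of the retry pattern; the helper names are illustrative stand-ins,
      not the kernel's pipe_iov_copy helpers. The point is that the non-atomic
      retry must continue from the saved offset and remaining length instead of
      starting over with reset values:

       #include <stdbool.h>
       #include <stdio.h>
       #include <string.h>

       /* Stand-in for the atomic copy (no page faults allowed): simulate it
        * stopping halfway, as happens when a user page is not resident. */
       static bool try_copy_atomic(char *dst, const char *src, size_t len,
                                   size_t *copied)
       {
           *copied = len / 2;
           memcpy(dst, src, *copied);
           return false;               /* partial copy: caller must retry */
       }

       /* Stand-in for the sleeping copy that is allowed to fault pages in. */
       static void copy_sleeping(char *dst, const char *src, size_t len)
       {
           memcpy(dst, src, len);
       }

       int main(void)
       {
           const char src[] = "0123456789abcdef";
           char dst[sizeof(src)] = { 0 };
           size_t copied = 0;

           if (!try_copy_atomic(dst, src, sizeof(src), &copied)) {
               /* Correct: resume where the atomic attempt stopped.  The bug
                * described above effectively reset the offset and remaining
                * length here, so the retry copied the wrong region. */
               copy_sleeping(dst + copied, src + copied, sizeof(src) - copied);
           }
           printf("%s\n", dst);        /* prints the complete string */
           return 0;
       }
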
      
      This was fixed upstream by commits f0d1bec9 ("new helper:
      copy_page_from_iter()") and 637b58c2 ("switch pipe_read() to
      copy_page_to_iter()"), but those aren't suitable for stable.  This fix
      for older kernel versions was made by Seth Jennings for RHEL and I
      have extracted it from their update.
      
      CVE-2015-1805
      
      References: https://bugzilla.redhat.com/show_bug.cgi?id=1202855
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
      [lizf: Backported to 3.4: adjust context]
      Signed-off-by: Zefan Li <lizefan@huawei.com>
      a39bf4a8
    • NET: ROSE: Don't dereference NULL neighbour pointer. · bee5f3e2
      Ralf Baechle authored
      commit d496f784 upstream.
      
      A ROSE socket doesn't always have a neighbour pointer, so check that the
      neighbour pointer is valid before dereferencing it.
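
      A tiny sketch of the guard pattern described above, using hypothetical
      stub types rather than the real ROSE structures:

       #include <stddef.h>
       #include <stdio.h>

       /* Hypothetical stand-ins for a ROSE socket and its neighbour entry. */
       struct neigh_stub { int users; };
       struct rose_sock_stub { struct neigh_stub *neighbour; };

       static void drop_neighbour(struct rose_sock_stub *rose)
       {
           /* Without this check, a socket with no neighbour would oops here. */
           if (rose->neighbour)
               rose->neighbour->users--;
       }

       int main(void)
       {
           struct rose_sock_stub unconnected = { .neighbour = NULL };

           drop_neighbour(&unconnected);   /* safe with the NULL check */
           puts("no NULL dereference");
           return 0;
       }
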
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      Tested-by: Bernard Pidoux <f6bvp@free.fr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Zefan Li <lizefan@huawei.com>
      bee5f3e2
    • block: fix ext_dev_lock lockdep report · bd3fa757
      Dan Williams authored
      commit 4d66e5e9 upstream.
      
       =================================
       [ INFO: inconsistent lock state ]
       4.1.0-rc7+ #217 Tainted: G           O
       ---------------------------------
       inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
       swapper/6/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
        (ext_devt_lock){+.?...}, at: [<ffffffff8143a60c>] blk_free_devt+0x3c/0x70
       {SOFTIRQ-ON-W} state was registered at:
         [<ffffffff810bf6b1>] __lock_acquire+0x461/0x1e70
         [<ffffffff810c1947>] lock_acquire+0xb7/0x290
         [<ffffffff818ac3a8>] _raw_spin_lock+0x38/0x50
         [<ffffffff8143a07d>] blk_alloc_devt+0x6d/0xd0  <-- take the lock in process context
      [..]
        [<ffffffff810bf64e>] __lock_acquire+0x3fe/0x1e70
        [<ffffffff810c00ad>] ? __lock_acquire+0xe5d/0x1e70
        [<ffffffff810c1947>] lock_acquire+0xb7/0x290
        [<ffffffff8143a60c>] ? blk_free_devt+0x3c/0x70
        [<ffffffff818ac3a8>] _raw_spin_lock+0x38/0x50
        [<ffffffff8143a60c>] ? blk_free_devt+0x3c/0x70
        [<ffffffff8143a60c>] blk_free_devt+0x3c/0x70    <-- take the lock in softirq
        [<ffffffff8143bfec>] part_release+0x1c/0x50
        [<ffffffff8158edf6>] device_release+0x36/0xb0
        [<ffffffff8145ac2b>] kobject_cleanup+0x7b/0x1a0
        [<ffffffff8145aad0>] kobject_put+0x30/0x70
        [<ffffffff8158f147>] put_device+0x17/0x20
        [<ffffffff8143c29c>] delete_partition_rcu_cb+0x16c/0x180
        [<ffffffff8143c130>] ? read_dev_sector+0xa0/0xa0
        [<ffffffff810e0e0f>] rcu_process_callbacks+0x2ff/0xa90
        [<ffffffff810e0dcf>] ? rcu_process_callbacks+0x2bf/0xa90
        [<ffffffff81067e2e>] __do_softirq+0xde/0x600
      
      Neil sees this in his tests and it also triggers on pmem driver unbind
      for the libnvdimm tests.  This fix is on top of an initial fix by Keith
      for incorrect usage of mutex_lock() in this path: 2da78092 "block:
      Fix dev_t minor allocation lifetime".  Both this and 2da78092 are
      candidates for -stable.
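
      The splat above is the classic process-context vs. softirq-context lock
      inconsistency: the same lock is taken from an RCU callback (softirq) and,
      without disabling bottom halves, from process context. A minimal sketch
      of the usual remedy, with illustrative names rather than the actual patch
      hunks:

       #include <linux/spinlock.h>

       static DEFINE_SPINLOCK(example_devt_lock);

       /* Process context: allocate an id.  Using plain spin_lock() here is
        * what lockdep flags as SOFTIRQ-ON-W, because a softirq could interrupt
        * us and deadlock on the same lock; spin_lock_bh() closes that window. */
       static void example_alloc(void)
       {
           spin_lock_bh(&example_devt_lock);
           /* ... allocate from the idr/ida protected by the lock ... */
           spin_unlock_bh(&example_devt_lock);
       }

       /* Softirq context (e.g. the RCU callback that frees a devt). */
       static void example_free(void)
       {
           spin_lock(&example_devt_lock);
           /* ... release the id ... */
           spin_unlock(&example_devt_lock);
       }
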
      
      Fixes: 2da78092 ("block: Fix dev_t minor allocation lifetime")
      Cc: Keith Busch <keith.busch@intel.com>
      Reported-by: NeilBrown <neilb@suse.de>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      [lizf: Backported to 3.4: adjust context]
      Signed-off-by: Zefan Li <lizefan@huawei.com>
      bd3fa757
    • bridge: superfluous skb->nfct check in br_nf_dev_queue_xmit · 59c4dd5e
      Vasily Averin authored
      commit aff09ce3 upstream.
      
      Currently the bridge can silently drop IPv4 fragments.
      If a node has the nf_defrag_ipv4 module loaded but not nf_conntrack_ipv4,
      br_nf_pre_routing defragments incoming IPv4 fragments, but the nfct check
      in br_nf_dev_queue_xmit does not allow the reassembled packet to be
      re-fragmented, so it is dropped in br_dev_queue_push_xmit without
      incrementing any fail counters.
      
      It seems the only way to hit the ip_fragment code in the bridge xmit
      path is to have a fragment list whose reassembled fragments go over
      the MTU. This only happens if nf_defrag is enabled. Thanks to
      Florian Westphal for providing feedback to clarify this.
      
      IPv4 defragmentation is required not only by conntrack but also by the
      TPROXY target and the socket match, so the #ifdef is changed from
      NF_CONNTRACK_IPV4 to NF_DEFRAG_IPV4.
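
      A small, self-contained sketch of the guard change described above (not
      the actual hunk; the config macros and function are illustrative): the
      re-fragmentation path must be compiled in whenever IPv4 defragmentation
      can happen, not only when conntrack is built.

       #include <stdio.h>

       /* Pretend kernel configuration: defrag enabled, conntrack not. */
       #define CONFIG_NF_DEFRAG_IPV4 1
       /* #define CONFIG_NF_CONNTRACK_IPV4 1 */

       static void queue_xmit_sketch(int skb_was_defragmented)
       {
       #if defined(CONFIG_NF_DEFRAG_IPV4)   /* was: CONFIG_NF_CONNTRACK_IPV4 */
           if (skb_was_defragmented) {
               puts("re-fragmenting oversized packet before xmit");
               return;
           }
       #endif
           puts("transmitting as-is");
       }

       int main(void)
       {
           queue_xmit_sketch(1);
           return 0;
       }
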
      Signed-off-by: Vasily Averin <vvs@openvz.org>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Cc: Kirill Tkhai <ktkhai@odin.com>
      Signed-off-by: Zefan Li <lizefan@huawei.com>
      59c4dd5e