1. 27 Apr, 2016 20 commits
  2. 21 Mar, 2016 20 commits
    • Zefan Li's avatar
      Linux 3.4.111 · 3389604d
      Zefan Li authored
      3389604d
    • Marcelo Tosatti's avatar
      KVM: x86: move steal time initialization to vcpu entry time · 6d470d7c
      Marcelo Tosatti authored
      commit 7cae2bed upstream.
      
      As reported at https://bugs.launchpad.net/qemu/+bug/1494350,
      it is possible to have vcpu->arch.st.last_steal initialized
      from a thread other than vcpu thread, say the iothread, via
      KVM_SET_MSRS.
      
      Which can cause an overflow later (when subtracting from vcpu threads
      sched_info.run_delay).
      
      To avoid that, move steal time accumulation to vcpu entry time,
      before copying steal time data to guest.
      Signed-off-by: default avatarMarcelo Tosatti <mtosatti@redhat.com>
      Reviewed-by: default avatarDavid Matlack <dmatlack@google.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      6d470d7c
    • Joe Thornber's avatar
      dm btree remove: fix a bug when rebalancing nodes after removal · 24fa51bd
      Joe Thornber authored
      commit 2871c69e upstream.
      
      Commit 4c7e3093 ("dm btree remove: fix bug in redistribute3") wasn't
      a complete fix for redistribute3().
      
      The redistribute3 function takes 3 btree nodes and shares out the entries
      evenly between them.  If the three nodes in total contained
      (MAX_ENTRIES * 3) - 1 entries between them then this was erroneously getting
      rebalanced as (MAX_ENTRIES - 1) on the left and right, and (MAX_ENTRIES + 1) in
      the center.
      
      Fix this issue by being more careful about calculating the target number
      of entries for the left and right nodes.
      
      Unit tested in userspace using this program:
      https://github.com/jthornber/redistribute3-test/blob/master/redistribute3_t.cSigned-off-by: default avatarJoe Thornber <ejt@redhat.com>
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      24fa51bd
    • John Youn's avatar
      usb: dwc3: Fix assignment of EP transfer resources · f43d490b
      John Youn authored
      commit c4509601 upstream.
      
      The assignement of EP transfer resources was not handled properly in the
      dwc3 driver. Commit aebda618 ("usb: dwc3: Reset the transfer
      resource index on SET_INTERFACE") previously fixed one aspect of this
      where resources may be exhausted with multiple calls to SET_INTERFACE.
      However, it introduced an issue where composite devices with multiple
      interfaces can be assigned the same transfer resources for different
      endpoints. This patch solves both issues.
      
      The assignment of transfer resources cannot perfectly follow the data
      book due to the fact that the controller driver does not have all
      knowledge of the configuration in advance. It is given this information
      piecemeal by the composite gadget framework after every
      SET_CONFIGURATION and SET_INTERFACE. Trying to follow the databook
      programming model in this scenario can cause errors. For two reasons:
      
      1) The databook says to do DEPSTARTCFG for every SET_CONFIGURATION and
      SET_INTERFACE (8.1.5). This is incorrect in the scenario of multiple
      interfaces.
      
      2) The databook does not mention doing more DEPXFERCFG for new endpoint
      on alt setting (8.1.6).
      
      The following simplified method is used instead:
      
      All hardware endpoints can be assigned a transfer resource and this
      setting will stay persistent until either a core reset or hibernation.
      So whenever we do a DEPSTARTCFG(0) we can go ahead and do DEPXFERCFG for
      every hardware endpoint as well. We are guaranteed that there are as
      many transfer resources as endpoints.
      
      This patch triggers off of the calling dwc3_gadget_start_config() for
      EP0-out, which always happens first, and which should only happen in one
      of the above conditions.
      
      Fixes: aebda618 ("usb: dwc3: Reset the transfer resource index on SET_INTERFACE")
      Reported-by: default avatarRavi Babu <ravibabu@ti.com>
      Signed-off-by: default avatarJohn Youn <johnyoun@synopsys.com>
      Signed-off-by: default avatarFelipe Balbi <balbi@kernel.org>
      [lizf: Backported to 3.4: adjust context]
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      f43d490b
    • Anssi Hannula's avatar
      ALSA: usb-audio: Add a more accurate volume quirk for AudioQuest DragonFly · d359d1d2
      Anssi Hannula authored
      commit 42e3121d upstream.
      
      AudioQuest DragonFly DAC reports a volume control range of 0..50
      (0x0000..0x0032) which in USB Audio means a range of 0 .. 0.2dB, which
      is obviously incorrect and would cause software using the dB information
      in e.g. volume sliders to have a massive volume difference in 100..102%
      range.
      
      Commit 2d1cb7f6 ("ALSA: usb-audio: add dB range mapping for some
      devices") added a dB range mapping for it with range 0..50 dB.
      
      However, the actual volume mapping seems to be neither linear volume nor
      linear dB scale, but instead quite close to the cubic mapping e.g.
      alsamixer uses, with a range of approx. -53...0 dB.
      
      Replace the previous quirk with a custom dB mapping based on some basic
      output measurements, using a 10-item range TLV (which will still fit in
      alsa-lib MAX_TLV_RANGE_SIZE).
      
      Tested on AudioQuest DragonFly HW v1.2. The quirk is only applied if the
      range is 0..50, so if this gets fixed/changed in later HW revisions it
      will no longer be applied.
      
      v2: incorporated Takashi Iwai's suggestion for the quirk application
      method
      Signed-off-by: default avatarAnssi Hannula <anssi.hannula@iki.fi>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      [lizf: Backoported to 3.4: use dev_info() instead of usb_audio_info()]
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      d359d1d2
    • Clemens Ladisch's avatar
      ALSA: tlv: add DECLARE_TLV_DB_RANGE() · ba8a85ef
      Clemens Ladisch authored
      commit bf1d1c9b upstream.
      
      Add a DECLARE_TLV_DB_RANGE() macro so that dB range information
      can be specified without having to count the items manually for
      TLV_DB_RANGE_HEAD().
      Signed-off-by: default avatarClemens Ladisch <clemens@ladisch.de>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      ba8a85ef
    • Clemens Ladisch's avatar
      ALSA: tlv: compute TLV_*_ITEM lengths automatically · 320f8303
      Clemens Ladisch authored
      commit b5b9eb54 upstream.
      
      Add helper macros with a little bit of preprocessor magic to
      automatically compute the length of a TLV item.  This lets us avoid
      having to compute this by hand, and will allow to use items that do
      not use a fixed length.
      Signed-off-by: default avatarClemens Ladisch <clemens@ladisch.de>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      320f8303
    • Jan Beulich's avatar
      x86/LDT: Print the real LDT base address · 0a165ad2
      Jan Beulich authored
      commit 0d430e3f upstream.
      
      This was meant to print base address and entry count; make it do so
      again.
      
      Fixes: 37868fe1 "x86/ldt: Make modify_ldt synchronous"
      Signed-off-by: default avatarJan Beulich <jbeulich@suse.com>
      Acked-by: default avatarAndy Lutomirski <luto@kernel.org>
      Link: http://lkml.kernel.org/r/56797D8402000078000C24F0@prv-mh.provo.novell.comSigned-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      [lizf: Backported to 3.4: adjust context]
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      0a165ad2
    • Rainer Weikusat's avatar
      af_unix: Guard against other == sk in unix_dgram_sendmsg · 78578995
      Rainer Weikusat authored
      commit a5527dda upstream.
      
      The unix_dgram_sendmsg routine use the following test
      
      if (unlikely(unix_peer(other) != sk && unix_recvq_full(other))) {
      
      to determine if sk and other are in an n:1 association (either
      established via connect or by using sendto to send messages to an
      unrelated socket identified by address). This isn't correct as the
      specified address could have been bound to the sending socket itself or
      because this socket could have been connected to itself by the time of
      the unix_peer_get but disconnected before the unix_state_lock(other). In
      both cases, the if-block would be entered despite other == sk which
      might either block the sender unintentionally or lead to trying to unlock
      the same spin lock twice for a non-blocking send. Add a other != sk
      check to guard against this.
      
      Fixes: 7d267278 ("unix: avoid use-after-free in ep_remove_wait_queue")
      Reported-By: default avatarPhilipp Hahn <pmhahn@pmhahn.de>
      Signed-off-by: default avatarRainer Weikusat <rweikusat@mobileactivedefense.com>
      Tested-by: default avatarPhilipp Hahn <pmhahn@pmhahn.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      78578995
    • Hannes Frederic Sowa's avatar
      net: fix warnings in 'make htmldocs' by moving macro definition out of field declaration · ca7d623e
      Hannes Frederic Sowa authored
      commit 7bbadd2d upstream.
      
      Docbook does not like the definition of macros inside a field declaration
      and adds a warning. Move the definition out.
      
      Fixes: 79462ad0 ("net: add validation for the socket syscall protocol argument")
      Reported-by: default avatarkbuild test robot <lkp@intel.com>
      Signed-off-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [lizf: Backported to 3.4: adjust context]
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      ca7d623e
    • Ben Zhang's avatar
      kernel/watchdog.c: touch_nmi_watchdog should only touch local cpu not every one · 40570888
      Ben Zhang authored
      commit 62572e29 upstream.
      
      I ran into a scenario where while one cpu was stuck and should have
      panic'd because of the NMI watchdog, it didn't.  The reason was another
      cpu was spewing stack dumps on to the console.  Upon investigation, I
      noticed that when writing to the console and also when dumping the
      stack, the watchdog is touched.
      
      This causes all the cpus to reset their NMI watchdog flags and the
      'stuck' cpu just spins forever.
      
      This change causes the semantics of touch_nmi_watchdog to be changed
      slightly.  Previously, I accidentally changed the semantics and we
      noticed there was a codepath in which touch_nmi_watchdog could be
      touched from a preemtible area.  That caused a BUG() to happen when
      CONFIG_DEBUG_PREEMPT was enabled.  I believe it was the acpi code.
      
      My attempt here re-introduces the change to have the
      touch_nmi_watchdog() code only touch the local cpu instead of all of the
      cpus.  But instead of using __get_cpu_var(), I use the
      __raw_get_cpu_var() version.
      
      This avoids the preemption problem.  However my reasoning wasn't because
      I was trying to be lazy.  Instead I rationalized it as, well if
      preemption is enabled then interrupts should be enabled to and the NMI
      watchdog will have no reason to trigger.  So it won't matter if the
      wrong cpu is touched because the percpu interrupt counters the NMI
      watchdog uses should still be incrementing.
      
      Don said:
      
      : I'm ok with this patch, though it does alter the behaviour of how
      : touch_nmi_watchdog works.  For the most part I don't think most callers
      : need to touch all of the watchdogs (on each cpu).  Perhaps a corner case
      : will pop up (the scheduler??  to mimic touch_all_softlockup_watchdogs() ).
      :
      : But this does address an issue where if a system is locked up and one cpu
      : is spewing out useful debug messages (or error messages), the hard lockup
      : will fail to go off.  We have seen this on RHEL also.
      Signed-off-by: default avatarDon Zickus <dzickus@redhat.com>
      Signed-off-by: default avatarBen Zhang <benzh@chromium.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      [lizf: Backported to 3.4: adjust context]
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      40570888
    • Michal Kubeček's avatar
      ipv6: prevent fib6_run_gc() contention · 5317d9af
      Michal Kubeček authored
      commit 2ac3ac8f upstream.
      
      On a high-traffic router with many processors and many IPv6 dst
      entries, soft lockup in fib6_run_gc() can occur when number of
      entries reaches gc_thresh.
      
      This happens because fib6_run_gc() uses fib6_gc_lock to allow
      only one thread to run the garbage collector but ip6_dst_gc()
      doesn't update net->ipv6.ip6_rt_last_gc until fib6_run_gc()
      returns. On a system with many entries, this can take some time
      so that in the meantime, other threads pass the tests in
      ip6_dst_gc() (ip6_rt_last_gc is still not updated) and wait for
      the lock. They then have to run the garbage collector one after
      another which blocks them for quite long.
      
      Resolve this by replacing special value ~0UL of expire parameter
      to fib6_run_gc() by explicit "force" parameter to choose between
      spin_lock_bh() and spin_trylock_bh() and call fib6_run_gc() with
      force=false if gc_thresh is reached but not max_size.
      Signed-off-by: default avatarMichal Kubecek <mkubecek@suse.cz>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [lizf: Backported to 3.4: adjust context]
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      5317d9af
    • Neil Brown's avatar
      SUNRPC: never enqueue a ->rq_cong request on ->sending · 31469735
      Neil Brown authored
      commit 29807318 upstream.
      
      If the sending queue has a task without ->rq_cong set at the front,
      and then a number of tasks with ->rq_cong set such that they use
      the entire congestion window, then the queue deadlocks.  The first
      entry cannot be processed until later entries complete.
      
      This scenario has been seen with a client using UDP to access a server,
      and the network connection breaking for a period of time - it doesn't
      recover.
      
      It never really makes sense for an ->rq_cong request to be on the ->sending
      queue, but it can happen when a request is being retried, and finds
      the transport if locked (XPRT_LOCKED).  In this case we simple call
      __xprt_put_cong() and the deadlock goes away.
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      Signed-off-by: default avatarTrond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      31469735
    • Sasha Levin's avatar
      atm: deal with setting entry before mkip was called · faf7b6b0
      Sasha Levin authored
      commit 34f5b006 upstream.
      
      If we didn't call ATMARP_MKIP before ATMARP_ENCAP the VCC descriptor is
      non-existant and we'll end up dereferencing a NULL ptr:
      
      [1033173.491930] kasan: GPF could be caused by NULL-ptr deref or user memory accessirq event stamp: 123386
      [1033173.493678] general protection fault: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN
      [1033173.493689] Modules linked in:
      [1033173.493697] CPU: 9 PID: 23815 Comm: trinity-c64 Not tainted 4.2.0-next-20150911-sasha-00043-g353d875-dirty #2545
      [1033173.493706] task: ffff8800630c4000 ti: ffff880063110000 task.ti: ffff880063110000
      [1033173.493823] RIP: clip_ioctl (net/atm/clip.c:320 net/atm/clip.c:689)
      [1033173.493826] RSP: 0018:ffff880063117a88  EFLAGS: 00010203
      [1033173.493828] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 000000000000000c
      [1033173.493830] RDX: 0000000000000002 RSI: ffffffffb3f10720 RDI: 0000000000000014
      [1033173.493832] RBP: ffff880063117b80 R08: ffff88047574d9a4 R09: 0000000000000000
      [1033173.493834] R10: 0000000000000000 R11: 0000000000000000 R12: 1ffff1000c622f53
      [1033173.493836] R13: ffff8800cb905500 R14: ffff8808d6da2000 R15: 00000000fffffdfd
      [1033173.493840] FS:  00007fa56b92d700(0000) GS:ffff880478000000(0000) knlGS:0000000000000000
      [1033173.493843] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [1033173.493845] CR2: 0000000000000000 CR3: 00000000630e8000 CR4: 00000000000006a0
      [1033173.493855] Stack:
      [1033173.493862]  ffffffffb0b60444 000000000000eaea 0000000041b58ab3 ffffffffb3c3ce32
      [1033173.493867]  ffffffffb0b6f3e0 ffffffffb0b60444 ffffffffb5ea2e50 1ffff1000c622f5e
      [1033173.493873]  ffff8800630c4cd8 00000000000ee09a ffffffffb3ec4888 ffffffffb5ea2de8
      [1033173.493874] Call Trace:
      [1033173.494108] do_vcc_ioctl (net/atm/ioctl.c:170)
      [1033173.494113] vcc_ioctl (net/atm/ioctl.c:189)
      [1033173.494116] svc_ioctl (net/atm/svc.c:605)
      [1033173.494200] sock_do_ioctl (net/socket.c:874)
      [1033173.494204] sock_ioctl (net/socket.c:958)
      [1033173.494244] do_vfs_ioctl (fs/ioctl.c:43 fs/ioctl.c:607)
      [1033173.494290] SyS_ioctl (fs/ioctl.c:622 fs/ioctl.c:613)
      [1033173.494295] entry_SYSCALL_64_fastpath (arch/x86/entry/entry_64.S:186)
      [1033173.494362] Code: fa 48 c1 ea 03 80 3c 02 00 0f 85 50 09 00 00 49 8b 9e 60 06 00 00 48 b8 00 00 00 00 00 fc ff df 48 8d 7b 14 48 89 fa 48 c1 ea 03 <0f> b6 04 02 48 89 fa 83 e2 07 38 d0 7f 08 84 c0 0f 85 14 09 00
      All code
      
      ========
         0:   fa                      cli
         1:   48 c1 ea 03             shr    $0x3,%rdx
         5:   80 3c 02 00             cmpb   $0x0,(%rdx,%rax,1)
         9:   0f 85 50 09 00 00       jne    0x95f
         f:   49 8b 9e 60 06 00 00    mov    0x660(%r14),%rbx
        16:   48 b8 00 00 00 00 00    movabs $0xdffffc0000000000,%rax
        1d:   fc ff df
        20:   48 8d 7b 14             lea    0x14(%rbx),%rdi
        24:   48 89 fa                mov    %rdi,%rdx
        27:   48 c1 ea 03             shr    $0x3,%rdx
        2b:*  0f b6 04 02             movzbl (%rdx,%rax,1),%eax               <-- trapping instruction
        2f:   48 89 fa                mov    %rdi,%rdx
        32:   83 e2 07                and    $0x7,%edx
        35:   38 d0                   cmp    %dl,%al
        37:   7f 08                   jg     0x41
        39:   84 c0                   test   %al,%al
        3b:   0f 85 14 09 00 00       jne    0x955
      
      Code starting with the faulting instruction
      ===========================================
         0:   0f b6 04 02             movzbl (%rdx,%rax,1),%eax
         4:   48 89 fa                mov    %rdi,%rdx
         7:   83 e2 07                and    $0x7,%edx
         a:   38 d0                   cmp    %dl,%al
         c:   7f 08                   jg     0x16
         e:   84 c0                   test   %al,%al
        10:   0f 85 14 09 00 00       jne    0x92a
      [1033173.494366] RIP clip_ioctl (net/atm/clip.c:320 net/atm/clip.c:689)
      [1033173.494368]  RSP <ffff880063117a88>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      faf7b6b0
    • Andrey Vagin's avatar
      netfilter: nf_conntrack: fix RCU race in nf_conntrack_find_get · 9b3dd23a
      Andrey Vagin authored
      commit c6825c09 upstream.
      
      Lets look at destroy_conntrack:
      
      hlist_nulls_del_rcu(&ct->tuplehash[IP_CT_DIR_ORIGINAL].hnnode);
      ...
      nf_conntrack_free(ct)
      	kmem_cache_free(net->ct.nf_conntrack_cachep, ct);
      
      net->ct.nf_conntrack_cachep is created with SLAB_DESTROY_BY_RCU.
      
      The hash is protected by rcu, so readers look up conntracks without
      locks.
      A conntrack is removed from the hash, but in this moment a few readers
      still can use the conntrack. Then this conntrack is released and another
      thread creates conntrack with the same address and the equal tuple.
      After this a reader starts to validate the conntrack:
      * It's not dying, because a new conntrack was created
      * nf_ct_tuple_equal() returns true.
      
      But this conntrack is not initialized yet, so it can not be used by two
      threads concurrently. In this case BUG_ON may be triggered from
      nf_nat_setup_info().
      
      Florian Westphal suggested to check the confirm bit too. I think it's
      right.
      
      task 1			task 2			task 3
      			nf_conntrack_find_get
      			 ____nf_conntrack_find
      destroy_conntrack
       hlist_nulls_del_rcu
       nf_conntrack_free
       kmem_cache_free
      						__nf_conntrack_alloc
      						 kmem_cache_alloc
      						 memset(&ct->tuplehash[IP_CT_DIR_MAX],
      			 if (nf_ct_is_dying(ct))
      			 if (!nf_ct_tuple_equal()
      
      I'm not sure, that I have ever seen this race condition in a real life.
      Currently we are investigating a bug, which is reproduced on a few nodes.
      In our case one conntrack is initialized from a few tasks concurrently,
      we don't have any other explanation for this.
      
      <2>[46267.083061] kernel BUG at net/ipv4/netfilter/nf_nat_core.c:322!
      ...
      <4>[46267.083951] RIP: 0010:[<ffffffffa01e00a4>]  [<ffffffffa01e00a4>] nf_nat_setup_info+0x564/0x590 [nf_nat]
      ...
      <4>[46267.085549] Call Trace:
      <4>[46267.085622]  [<ffffffffa023421b>] alloc_null_binding+0x5b/0xa0 [iptable_nat]
      <4>[46267.085697]  [<ffffffffa02342bc>] nf_nat_rule_find+0x5c/0x80 [iptable_nat]
      <4>[46267.085770]  [<ffffffffa0234521>] nf_nat_fn+0x111/0x260 [iptable_nat]
      <4>[46267.085843]  [<ffffffffa0234798>] nf_nat_out+0x48/0xd0 [iptable_nat]
      <4>[46267.085919]  [<ffffffff814841b9>] nf_iterate+0x69/0xb0
      <4>[46267.085991]  [<ffffffff81494e70>] ? ip_finish_output+0x0/0x2f0
      <4>[46267.086063]  [<ffffffff81484374>] nf_hook_slow+0x74/0x110
      <4>[46267.086133]  [<ffffffff81494e70>] ? ip_finish_output+0x0/0x2f0
      <4>[46267.086207]  [<ffffffff814b5890>] ? dst_output+0x0/0x20
      <4>[46267.086277]  [<ffffffff81495204>] ip_output+0xa4/0xc0
      <4>[46267.086346]  [<ffffffff814b65a4>] raw_sendmsg+0x8b4/0x910
      <4>[46267.086419]  [<ffffffff814c10fa>] inet_sendmsg+0x4a/0xb0
      <4>[46267.086491]  [<ffffffff814459aa>] ? sock_update_classid+0x3a/0x50
      <4>[46267.086562]  [<ffffffff81444d67>] sock_sendmsg+0x117/0x140
      <4>[46267.086638]  [<ffffffff8151997b>] ? _spin_unlock_bh+0x1b/0x20
      <4>[46267.086712]  [<ffffffff8109d370>] ? autoremove_wake_function+0x0/0x40
      <4>[46267.086785]  [<ffffffff81495e80>] ? do_ip_setsockopt+0x90/0xd80
      <4>[46267.086858]  [<ffffffff8100be0e>] ? call_function_interrupt+0xe/0x20
      <4>[46267.086936]  [<ffffffff8118cb10>] ? ub_slab_ptr+0x20/0x90
      <4>[46267.087006]  [<ffffffff8118cb10>] ? ub_slab_ptr+0x20/0x90
      <4>[46267.087081]  [<ffffffff8118f2e8>] ? kmem_cache_alloc+0xd8/0x1e0
      <4>[46267.087151]  [<ffffffff81445599>] sys_sendto+0x139/0x190
      <4>[46267.087229]  [<ffffffff81448c0d>] ? sock_setsockopt+0x16d/0x6f0
      <4>[46267.087303]  [<ffffffff810efa47>] ? audit_syscall_entry+0x1d7/0x200
      <4>[46267.087378]  [<ffffffff810ef795>] ? __audit_syscall_exit+0x265/0x290
      <4>[46267.087454]  [<ffffffff81474885>] ? compat_sys_setsockopt+0x75/0x210
      <4>[46267.087531]  [<ffffffff81474b5f>] compat_sys_socketcall+0x13f/0x210
      <4>[46267.087607]  [<ffffffff8104dea3>] ia32_sysret+0x0/0x5
      <4>[46267.087676] Code: 91 20 e2 01 75 29 48 89 de 4c 89 f7 e8 56 fa ff ff 85 c0 0f 84 68 fc ff ff 0f b6 4d c6 41 8b 45 00 e9 4d fb ff ff e8 7c 19 e9 e0 <0f> 0b eb fe f6 05 17 91 20 e2 80 74 ce 80 3d 5f 2e 00 00 00 74
      <1>[46267.088023] RIP  [<ffffffffa01e00a4>] nf_nat_setup_info+0x564/0x590
      
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Florian Westphal <fw@strlen.de>
      Cc: Pablo Neira Ayuso <pablo@netfilter.org>
      Cc: Patrick McHardy <kaber@trash.net>
      Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Signed-off-by: default avatarAndrey Vagin <avagin@openvz.org>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      9b3dd23a
    • Hannes Frederic Sowa's avatar
      ipv6: probe routes asynchronous in rt6_probe · bea46f47
      Hannes Frederic Sowa authored
      commit c2f17e82 upstream.
      
      Routes need to be probed asynchronous otherwise the call stack gets
      exhausted when the kernel attemps to deliver another skb inline, like
      e.g. xt_TEE does, and we probe at the same time.
      
      We update neigh->updated still at once, otherwise we would send to
      many probes.
      
      Cc: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [lizf: Backported to 3.4: adjust context]
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      bea46f47
    • bingtian.ly@taobao.com's avatar
      net: avoid to hang up on sending due to sysctl configuration overflow. · d865a115
      bingtian.ly@taobao.com authored
      commit cdda8891 upstream.
      
          I found if we write a larger than 4GB value to some sysctl
      variables, the sending syscall will hang up forever, because these
      variables are 32 bits, such large values make them overflow to 0 or
      negative.
      
          This patch try to fix overflow or prevent from zero value setup
      of below sysctl variables:
      
      net.core.wmem_default
      net.core.rmem_default
      
      net.core.rmem_max
      net.core.wmem_max
      
      net.ipv4.udp_rmem_min
      net.ipv4.udp_wmem_min
      
      net.ipv4.tcp_wmem
      net.ipv4.tcp_rmem
      Signed-off-by: default avatarEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: default avatarLi Yu <raise.sail@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [lizf: Backported to 3.4: adjust context]
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      d865a115
    • Linus Torvalds's avatar
      Initialize msg/shm IPC objects before doing ipc_addid() · 3cda8eb0
      Linus Torvalds authored
      commit b9a53227 upstream.
      
      As reported by Dmitry Vyukov, we really shouldn't do ipc_addid() before
      having initialized the IPC object state.  Yes, we initialize the IPC
      object in a locked state, but with all the lockless RCU lookup work,
      that IPC object lock no longer means that the state cannot be seen.
      
      We already did this for the IPC semaphore code (see commit e8577d1f:
      "ipc/sem.c: fully initialize sem_array before making it visible") but we
      clearly forgot about msg and shm.
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Cc: Manfred Spraul <manfred@colorfullife.com>
      Cc: Davidlohr Bueso <dbueso@suse.de>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      [lizf: Backported to 3.4: adjust context]
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      3cda8eb0
    • Al Viro's avatar
      get rid of s_files and files_lock · f074c267
      Al Viro authored
      commit eee5cc27 upstream.
      
      The only thing we need it for is alt-sysrq-r (emergency remount r/o)
      and these days we can do just as well without going through the
      list of files.
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      [lizf: Backported to 3.4: adjust context]
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      f074c267
    • Paolo Bonzini's avatar
      KVM: svm: unconditionally intercept #DB · 8f452aa3
      Paolo Bonzini authored
      commit cbdb967a upstream.
      
      This is needed to avoid the possibility that the guest triggers
      an infinite stream of #DB exceptions (CVE-2015-8104).
      
      VMX is not affected: because it does not save DR6 in the VMCS,
      it already intercepts #DB unconditionally.
      Reported-by: default avatarJan Beulich <jbeulich@suse.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      [bwh: Backported to 3.2, with thanks to Paolo:
       - update_db_bp_intercept() was called update_db_intercept()
       - The remaining call is in svm_guest_debug() rather than through svm_x86_ops]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      8f452aa3