1. 06 Aug, 2014 19 commits
  2. 04 Aug, 2014 6 commits
    • David Vrabel's avatar
      x86/xen: safely map and unmap grant frames when in atomic context · 1cd4dd3a
      David Vrabel authored
      commit b7dd0e35 upstream.
      
      arch_gnttab_map_frames() and arch_gnttab_unmap_frames() are called in
      atomic context but were calling alloc_vm_area() which might sleep.
      
      Also, if a driver attempts to allocate a grant ref from an interrupt
      and the table needs expanding, then the CPU may already by in lazy MMU
      mode and apply_to_page_range() will BUG when it tries to re-enable
      lazy MMU mode.
      
      These two functions are only used in PV guests.
      
      Introduce arch_gnttab_init() to allocates the virtual address space in
      advance.
      
      Avoid the use of apply_to_page_range() by using saving and using the
      array of PTE addresses from the alloc_vm_area() call (which ensures
      that the required page tables are pre-allocated).
      Signed-off-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      Signed-off-by: default avatarKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      [ David Vrabel: Backported to 3.13.10. ]
      Signed-off-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      Cc: Stefan Bader <stefan.bader@canonical.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      1cd4dd3a
    • Mengdong Lin's avatar
      ALSA: hda - verify pin:cvt connection on preparing a stream for Intel HDMI codec · 350cc11e
      Mengdong Lin authored
      commit 2df6742f upstream.
      
      This is a temporary fix for some Intel HDMI codecs to avoid no sound output for
      a resuming playback after S3.
      
      After S3, the audio driver restores pin:cvt connection selections by
      snd_hda_codec_resume_cache(). However this can happen before the gfx side is
      ready and such connect selection is overlooked by HW. After gfx is ready, the
      pins make the default selection again. And this will cause multiple pins share
      a same convertor and mute control will affect each other. Thus a resumed audio
      playback become silent after S3.
      
      This patch verifies pin:cvt connection on preparing a stream, to assure the pin
      selects the right convetor and an assigned convertor is not shared by other
      unused pins. Apply this fix-up on Haswell, Broadwell and Valleyview (Baytrail).
      
      We need this temporary fix before a reliable software communication channel is
      established between audio and gfx, to sync audio/gfx operations.
      Signed-off-by: default avatarMengdong Lin <mengdong.lin@intel.com>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Cc: David Henningsson <david.henningsson@canonical.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      350cc11e
    • Mengdong Lin's avatar
      ALSA: hda - verify pin:converter connection on unsol event for HSW and VLV · 77d080e2
      Mengdong Lin authored
      commit b4f75aea upstream.
      
      This patch will verify the pin's coverter selection for an active stream
      when an unsol event reports this pin becomes available again after a display
      mode change or hot-plug event.
      
      For Haswell+ and Valleyview: display mode change or hot-plug can change the
      transcoder:port connection and make all the involved audio pins share the 1st
      converter. So the stream using 1st convertor will flow to multiple pins
      but active streams using other converters will fail. This workaround
      is to assure the pin selects the right conveter and an assigned converter is
      not shared by other unused pins.
      Signed-off-by: default avatarMengdong Lin <mengdong.lin@intel.com>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Cc: David Henningsson <david.henningsson@canonical.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      77d080e2
    • Mengdong Lin's avatar
      ALSA: hda/hdmi - apply all Haswell fix-ups to Broadwell display codec · fbfc0ce8
      Mengdong Lin authored
      commit 75dcbe4d upstream.
      
      Broadwell and Haswell have the same behavior on display audio. So this patch
      defines is_haswell_plus() to include codecs for both Haswell and its successor
      Broadwell, and apply all Haswell fix-ups to Broadwell.
      Signed-off-by: default avatarMengdong Lin <mengdong.lin@intel.com>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Cc: David Henningsson <david.henningsson@canonical.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      fbfc0ce8
    • Peter Christensen's avatar
      ipvs: Fix panic due to non-linear skb · f8039cdc
      Peter Christensen authored
      commit f44a5f45 upstream.
      
      Receiving a ICMP response to an IPIP packet in a non-linear skb could
      cause a kernel panic in __skb_pull.
      
      The problem was introduced in
      commit f2edb9f7 ("ipvs: implement
      passive PMTUD for IPIP packets").
      Signed-off-by: default avatarPeter Christensen <pch@ordbogen.com>
      Acked-by: default avatarJulian Anastasov <ja@ssi.bg>
      Signed-off-by: default avatarSimon Horman <horms@verge.net.au>
      Cc: Chris J Arges <chris.j.arges@canonical.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      f8039cdc
    • Wei-Chun Chao's avatar
      net: fix UDP tunnel GSO of frag_list GRO packets · 8e7967e7
      Wei-Chun Chao authored
      commit 5882a07c upstream.
      
      This patch fixes a kernel BUG_ON in skb_segment. It is hit when
      testing two VMs on openvswitch with one VM acting as VXLAN gateway.
      
      During VXLAN packet GSO, skb_segment is called with skb->data
      pointing to inner TCP payload. skb_segment calls skb_network_protocol
      to retrieve the inner protocol. skb_network_protocol actually expects
      skb->data to point to MAC and it calls pskb_may_pull with ETH_HLEN.
      This ends up pulling in ETH_HLEN data from header tail. As a result,
      pskb_trim logic is skipped and BUG_ON is hit later.
      
      Move skb_push in front of skb_network_protocol so that skb->data
      lines up properly.
      
      kernel BUG at net/core/skbuff.c:2999!
      Call Trace:
      [<ffffffff816ac412>] tcp_gso_segment+0x122/0x410
      [<ffffffff816bc74c>] inet_gso_segment+0x13c/0x390
      [<ffffffff8164b39b>] skb_mac_gso_segment+0x9b/0x170
      [<ffffffff816b3658>] skb_udp_tunnel_segment+0xd8/0x390
      [<ffffffff816b3c00>] udp4_ufo_fragment+0x120/0x140
      [<ffffffff816bc74c>] inet_gso_segment+0x13c/0x390
      [<ffffffff8109d742>] ? default_wake_function+0x12/0x20
      [<ffffffff8164b39b>] skb_mac_gso_segment+0x9b/0x170
      [<ffffffff8164b4d0>] __skb_gso_segment+0x60/0xc0
      [<ffffffff8164b6b3>] dev_hard_start_xmit+0x183/0x550
      [<ffffffff8166c91e>] sch_direct_xmit+0xfe/0x1d0
      [<ffffffff8164bc94>] __dev_queue_xmit+0x214/0x4f0
      [<ffffffff8164bf90>] dev_queue_xmit+0x10/0x20
      [<ffffffff81687edb>] ip_finish_output+0x66b/0x890
      [<ffffffff81688a58>] ip_output+0x58/0x90
      [<ffffffff816c628f>] ? fib_table_lookup+0x29f/0x350
      [<ffffffff816881c9>] ip_local_out_sk+0x39/0x50
      [<ffffffff816cbfad>] iptunnel_xmit+0x10d/0x130
      [<ffffffffa0212200>] vxlan_xmit_skb+0x1d0/0x330 [vxlan]
      [<ffffffffa02a3919>] vxlan_tnl_send+0x129/0x1a0 [openvswitch]
      [<ffffffffa02a2cd6>] ovs_vport_send+0x26/0xa0 [openvswitch]
      [<ffffffffa029931e>] do_output+0x2e/0x50 [openvswitch]
      Signed-off-by: default avatarWei-Chun Chao <weichunc@plumgrid.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      (backported from commit 5882a07c)
      Signed-off-by: default avatarDave Chiluk <chiluk@canonical.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      8e7967e7
  3. 29 Jul, 2014 4 commits
    • Hugh Dickins's avatar
      shmem: fix splicing from a hole while it's punched · 760f8da2
      Hugh Dickins authored
      commit b1a36650 upstream.
      
      shmem_fault() is the actual culprit in trinity's hole-punch starvation,
      and the most significant cause of such problems: since a page faulted is
      one that then appears page_mapped(), needing unmap_mapping_range() and
      i_mmap_mutex to be unmapped again.
      
      But it is not the only way in which a page can be brought into a hole in
      the radix_tree while that hole is being punched; and Vlastimil's testing
      implies that if enough other processors are busy filling in the hole,
      then shmem_undo_range() can be kept from completing indefinitely.
      
      shmem_file_splice_read() is the main other user of SGP_CACHE, which can
      instantiate shmem pagecache pages in the read-only case (without holding
      i_mutex, so perhaps concurrently with a hole-punch).  Probably it's
      silly not to use SGP_READ already (using the ZERO_PAGE for holes): which
      ought to be safe, but might bring surprises - not a change to be rushed.
      
      shmem_read_mapping_page_gfp() is an internal interface used by
      drivers/gpu/drm GEM (and next by uprobes): it should be okay.  And
      shmem_file_read_iter() uses the SGP_DIRTY variant of SGP_CACHE, when
      called internally by the kernel (perhaps for a stacking filesystem,
      which might rely on holes to be reserved): it's unclear whether it could
      be provoked to keep hole-punch busy or not.
      
      We could apply the same umbrella as now used in shmem_fault() to
      shmem_file_splice_read() and the others; but it looks ugly, and use over
      a range raises questions - should it actually be per page? can these get
      starved themselves?
      
      The origin of this part of the problem is my v3.1 commit d0823576
      ("mm: pincer in truncate_inode_pages_range"), once it was duplicated
      into shmem.c.  It seemed like a nice idea at the time, to ensure
      (barring RCU lookup fuzziness) that there's an instant when the entire
      hole is empty; but the indefinitely repeated scans to ensure that make
      it vulnerable.
      
      Revert that "enhancement" to hole-punch from shmem_undo_range(), but
      retain the unproblematic rescanning when it's truncating; add a couple
      of comments there.
      
      Remove the "indices[0] >= end" test: that is now handled satisfactorily
      by the inner loop, and mem_cgroup_uncharge_start()/end() are too light
      to be worth avoiding here.
      
      But if we do not always loop indefinitely, we do need to handle the case
      of swap swizzled back to page before shmem_free_swap() gets it: add a
      retry for that case, as suggested by Konstantin Khlebnikov; and for the
      case of page swizzled back to swap, as suggested by Johannes Weiner.
      Signed-off-by: default avatarHugh Dickins <hughd@google.com>
      Reported-by: default avatarSasha Levin <sasha.levin@oracle.com>
      Suggested-by: default avatarVlastimil Babka <vbabka@suse.cz>
      Cc: Konstantin Khlebnikov <koct9i@gmail.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Lukas Czerner <lczerner@redhat.com>
      Cc: Dave Jones <davej@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      [ luis: backported to 3.11: used hughd's backport to 3.10.50 ]
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      760f8da2
    • Hugh Dickins's avatar
      shmem: fix faulting into a hole, not taking i_mutex · e0533ae3
      Hugh Dickins authored
      commit 8e205f77 upstream.
      
      Commit f00cdc6d ("shmem: fix faulting into a hole while it's
      punched") was buggy: Sasha sent a lockdep report to remind us that
      grabbing i_mutex in the fault path is a no-no (write syscall may already
      hold i_mutex while faulting user buffer).
      
      We tried a completely different approach (see following patch) but that
      proved inadequate: good enough for a rational workload, but not good
      enough against trinity - which forks off so many mappings of the object
      that contention on i_mmap_mutex while hole-puncher holds i_mutex builds
      into serious starvation when concurrent faults force the puncher to fall
      back to single-page unmap_mapping_range() searches of the i_mmap tree.
      
      So return to the original umbrella approach, but keep away from i_mutex
      this time.  We really don't want to bloat every shmem inode with a new
      mutex or completion, just to protect this unlikely case from trinity.
      So extend the original with wait_queue_head on stack at the hole-punch
      end, and wait_queue item on the stack at the fault end.
      
      This involves further use of i_lock to guard against the races: lockdep
      has been happy so far, and I see fs/inode.c:unlock_new_inode() holds
      i_lock around wake_up_bit(), which is comparable to what we do here.
      i_lock is more convenient, but we could switch to shmem's info->lock.
      
      This issue has been tagged with CVE-2014-4171, which will require commit
      f00cdc6d and this and the following patch to be backported: we
      suggest to 3.1+, though in fact the trinity forkbomb effect might go
      back as far as 2.6.16, when madvise(,,MADV_REMOVE) came in - or might
      not, since much has changed, with i_mmap_mutex a spinlock before 3.0.
      Anyone running trinity on 3.0 and earlier? I don't think we need care.
      Signed-off-by: default avatarHugh Dickins <hughd@google.com>
      Reported-by: default avatarSasha Levin <sasha.levin@oracle.com>
      Tested-by: default avatarSasha Levin <sasha.levin@oracle.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Konstantin Khlebnikov <koct9i@gmail.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Lukas Czerner <lczerner@redhat.com>
      Cc: Dave Jones <davej@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      e0533ae3
    • Hugh Dickins's avatar
      shmem: fix faulting into a hole while it's punched · bca9495d
      Hugh Dickins authored
      commit f00cdc6d upstream.
      
      Trinity finds that mmap access to a hole while it's punched from shmem
      can prevent the madvise(MADV_REMOVE) or fallocate(FALLOC_FL_PUNCH_HOLE)
      from completing, until the reader chooses to stop; with the puncher's
      hold on i_mutex locking out all other writers until it can complete.
      
      It appears that the tmpfs fault path is too light in comparison with its
      hole-punching path, lacking an i_data_sem to obstruct it; but we don't
      want to slow down the common case.
      
      Extend shmem_fallocate()'s existing range notification mechanism, so
      shmem_fault() can refrain from faulting pages into the hole while it's
      punched, waiting instead on i_mutex (when safe to sleep; or repeatedly
      faulting when not).
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: default avatarHugh Dickins <hughd@google.com>
      Reported-by: default avatarSasha Levin <sasha.levin@oracle.com>
      Tested-by: default avatarSasha Levin <sasha.levin@oracle.com>
      Cc: Dave Jones <davej@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      bca9495d
    • Sven Wegener's avatar
      x86_32, entry: Store badsys error code in %eax · 5eb2fae2
      Sven Wegener authored
      commit 8142b215 upstream.
      
      Commit 554086d8 ("x86_32, entry: Do syscall exit work on badsys
      (CVE-2014-4508)") introduced a regression in the x86_32 syscall entry
      code, resulting in syscall() not returning proper errors for undefined
      syscalls on CPUs supporting the sysenter feature.
      
      The following code:
      
      > int result = syscall(666);
      > printf("result=%d errno=%d error=%s\n", result, errno, strerror(errno));
      
      results in:
      
      > result=666 errno=0 error=Success
      
      Obviously, the syscall return value is the called syscall number, but it
      should have been an ENOSYS error. When run under ptrace it behaves
      correctly, which makes it hard to debug in the wild:
      
      > result=-1 errno=38 error=Function not implemented
      
      The %eax register is the return value register. For debugging via ptrace
      the syscall entry code stores the complete register context on the
      stack. The badsys handlers only store the ENOSYS error code in the
      ptrace register set and do not set %eax like a regular syscall handler
      would. The old resume_userspace call chain contains code that clobbers
      %eax and it restores %eax from the ptrace registers afterwards. The same
      goes for the ptrace-enabled call chain. When ptrace is not used, the
      syscall return value is the passed-in syscall number from the untouched
      %eax register.
      
      Use %eax as the return value register in syscall_badsys and
      sysenter_badsys, like a real syscall handler does, and have the caller
      push the value onto the stack for ptrace access.
      Signed-off-by: default avatarSven Wegener <sven.wegener@stealer.net>
      Link: http://lkml.kernel.org/r/alpine.LNX.2.11.1407221022380.31021@titan.int.lan.stealer.netReviewed-and-tested-by: default avatarAndy Lutomirski <luto@amacapital.net>
      Signed-off-by: default avatarH. Peter Anvin <hpa@zytor.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      5eb2fae2
  4. 24 Jul, 2014 1 commit
  5. 21 Jul, 2014 4 commits
    • Xufeng Zhang's avatar
      sctp: Fix sk_ack_backlog wrap-around problem · 56ac66c9
      Xufeng Zhang authored
      commit d3217b15 upstream.
      
      Consider the scenario:
      For a TCP-style socket, while processing the COOKIE_ECHO chunk in
      sctp_sf_do_5_1D_ce(), after it has passed a series of sanity check,
      a new association would be created in sctp_unpack_cookie(), but afterwards,
      some processing maybe failed, and sctp_association_free() will be called to
      free the previously allocated association, in sctp_association_free(),
      sk_ack_backlog value is decremented for this socket, since the initial
      value for sk_ack_backlog is 0, after the decrement, it will be 65535,
      a wrap-around problem happens, and if we want to establish new associations
      afterward in the same socket, ABORT would be triggered since sctp deem the
      accept queue as full.
      Fix this issue by only decrementing sk_ack_backlog for associations in
      the endpoint's list.
      Fix-suggested-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: default avatarXufeng Zhang <xufeng.zhang@windriver.com>
      Acked-by: default avatarDaniel Borkmann <dborkman@redhat.com>
      Acked-by: default avatarVlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Reference: CVE-2014-4667
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      56ac66c9
    • Nicholas Bellinger's avatar
      target: Explicitly clear ramdisk_mcp backend pages · ff5d076f
      Nicholas Bellinger authored
      [Note that a different patch to address the same issue went in during
      v3.15-rc1 (commit 4442dc8a), but includes a bunch of other changes that
      don't strictly apply to fixing the bug.]
      
      This patch changes rd_allocate_sgl_table() to explicitly clear
      ramdisk_mcp backend memory pages by passing __GFP_ZERO into
      alloc_pages().
      
      This addresses a potential security issue where reading from a
      ramdisk_mcp could return sensitive information, and follows what
      >= v3.15 does to explicitly clear ramdisk_mcp memory at backend
      device initialization time.
      Reported-by: default avatarJorge Daniel Sequeira Matias <jdsm@tecnico.ulisboa.pt>
      Cc: Jorge Daniel Sequeira Matias <jdsm@tecnico.ulisboa.pt>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Reference: CVE-2014-4027
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      ff5d076f
    • Paolo Bonzini's avatar
      KVM: ioapic: fix assignment of ioapic->rtc_status.pending_eoi (CVE-2014-0155) · bad3f63f
      Paolo Bonzini authored
      commit 5678de3f upstream.
      
      QE reported that they got the BUG_ON in ioapic_service to trigger.
      I cannot reproduce it, but there are two reasons why this could happen.
      
      The less likely but also easiest one, is when kvm_irq_delivery_to_apic
      does not deliver to any APIC and returns -1.
      
      Because irqe.shorthand == 0, the kvm_for_each_vcpu loop in that
      function is never reached.  However, you can target the similar loop in
      kvm_irq_delivery_to_apic_fast; just program a zero logical destination
      address into the IOAPIC, or an out-of-range physical destination address.
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      bad3f63f
    • Tony Camuso's avatar
      ACPI / PAD: call schedule() when need_resched() is true · a0c72a16
      Tony Camuso authored
      commit 5b59c69e upstream.
      
      The purpose of the acpi_pad driver is to implement the "processor power
      aggregator" device as described in the ACPI 4.0 spec section 8.5. It
      takes requests from the BIOS (via ACPI) to put a specified number of
      CPUs into idle, in order to save power, until further notice.
      
      It does this by creating high-priority threads that try to keep the CPUs
      in a high C-state (using the monitor/mwait CPU instructions). The
      mwait() call is in a loop that checks periodically if the thread should
      end and a few other things.
      
      It was discovered through testing that the power_saving threads were
      causing the system to consume more power than the system was consuming
      before the threads were created. A counter in the main loop of
      power_saving_thread() revealed that it was spinning. The mwait()
      instruction was not keeping the CPU in a high C state very much if at
      all.
      
      Here is a simplification of the loop in function power_saving_thread() in
      drivers/acpi/acpi_pad.c
      
          while (!kthread_should_stop()) {
               :
              try_to_freeze()
               :
              while (!need_resched()) {
                   :
                  if (!need_resched())
                      __mwait(power_saving_mwait_eax, 1);
                   :
                  if (jiffies > expire_time) {
                      do_sleep = 1;
                      break;
                  }
              }
          }
      
      If need_resched() returns true, then mwait() is not called. It was
      returning true because of things like timer interrupts, as in the
      following sequence.
      
      hrtimer_interrupt->__run_hrtimer->tick_sched_timer-> update_process_times->
      rcu_check_callbacks->rcu_pending->__rcu_pending->set_need_resched
      
      Kernels 3.5.0-rc2+ do not exhibit this problem, because a patch to
      try_to_freeze() in include/linux/freezer.h introduces a call to
      might_sleep(), which ultimately calls schedule() to clear the reschedule
      flag and allows the the loop to execute the call to mwait().
      
      However, the changes to try_to_freeze are unrelated to acpi_pad, and it
      does not seem like a good idea to rely on an unrelated patch in a
      function that could later be changed and reintroduce this bug.
      
      Therefore, it seems better to make an explicit call to schedule() in the
      outer loop when the need_resched flag is set.
      Reported-and-tested-by: default avatarStuart Hayes <stuart_hayes@dell.com>
      Signed-off-by: default avatarTony Camuso <tcamuso@redhat.com>
      Signed-off-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Cc: Leann Ogasawara <leann.ogasawara@canonical.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      a0c72a16
  6. 18 Jul, 2014 1 commit
  7. 16 Jul, 2014 5 commits