1. 06 Mar, 2015 19 commits
  2. 20 Feb, 2015 21 commits
    • Ben Hutchings's avatar
      Linux 3.2.67 · fd623507
      Ben Hutchings authored
      fd623507
    • Myron Stowe's avatar
      PCI: Handle read-only BARs on AMD CS553x devices · a758de0e
      Myron Stowe authored
      commit 06cf35f9 upstream.
      
      Some AMD CS553x devices have read-only BARs because of a firmware or
      hardware defect.  There's a workaround in quirk_cs5536_vsa(), but it no
      longer works after 36e81648 ("PCI: Restore detection of read-only
      BARs").  Prior to 36e81648, we filled in res->start; afterwards we
      leave it zeroed out.  The quirk only updated the size, so the driver tried
      to use a region starting at zero, which didn't work.
      
      Expand quirk_cs5536_vsa() to read the base addresses from the BARs and
      hard-code the sizes.
      
      On Nix's system BAR 2's read-only value is 0x6200.  Prior to 36e81648,
      we interpret that as a 512-byte BAR based on the lowest-order bit set.  Per
      datasheet sec 5.6.1, that BAR (MFGPT) requires only 64 bytes; use that to
      avoid clearing any address bits if a platform uses only 64-byte alignment.
      
      [bhelgaas: changelog, reduce BAR 2 size to 64]
      Fixes: 36e81648 ("PCI: Restore detection of read-only BARs")
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=85991#c4
      Link: http://support.amd.com/TechDocs/31506_cs5535_databook.pdf
      Link: http://support.amd.com/TechDocs/33238G_cs5536_db.pdfReported-and-tested-by: default avatarNix <nix@esperi.org.uk>
      Signed-off-by: default avatarMyron Stowe <myron.stowe@redhat.com>
      Signed-off-by: default avatarBjorn Helgaas <bhelgaas@google.com>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      a758de0e
    • Nadav Amit's avatar
      KVM: x86: SYSENTER emulation is broken · 038911f3
      Nadav Amit authored
      commit f3747379 upstream.
      
      SYSENTER emulation is broken in several ways:
      1. It misses the case of 16-bit code segments completely (CVE-2015-0239).
      2. MSR_IA32_SYSENTER_CS is checked in 64-bit mode incorrectly (bits 0 and 1 can
         still be set without causing #GP).
      3. MSR_IA32_SYSENTER_EIP and MSR_IA32_SYSENTER_ESP are not masked in
         legacy-mode.
      4. There is some unneeded code.
      
      Fix it.
      
      Cc: stable@vger.linux.org
      Signed-off-by: default avatarNadav Amit <namit@cs.technion.ac.il>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      038911f3
    • Avi Kivity's avatar
      KVM: x86 emulator: reject SYSENTER in compatibility mode on AMD guests · d5616c08
      Avi Kivity authored
      commit 1a18a69b upstream.
      
      If the guest thinks it's an AMD, it will not have prepared the SYSENTER MSRs,
      and if the guest executes SYSENTER in compatibility mode, it will fails.
      
      Detect this condition and #UD instead, like the spec says.
      Signed-off-by: default avatarAvi Kivity <avi@redhat.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      d5616c08
    • Florian Westphal's avatar
      netfilter: conntrack: disable generic tracking for known protocols · d7cde286
      Florian Westphal authored
      commit db29a950 upstream.
      
      Given following iptables ruleset:
      
      -P FORWARD DROP
      -A FORWARD -m sctp --dport 9 -j ACCEPT
      -A FORWARD -p tcp --dport 80 -j ACCEPT
      -A FORWARD -p tcp -m conntrack -m state ESTABLISHED,RELATED -j ACCEPT
      
      One would assume that this allows SCTP on port 9 and TCP on port 80.
      Unfortunately, if the SCTP conntrack module is not loaded, this allows
      *all* SCTP communication, to pass though, i.e. -p sctp -j ACCEPT,
      which we think is a security issue.
      
      This is because on the first SCTP packet on port 9, we create a dummy
      "generic l4" conntrack entry without any port information (since
      conntrack doesn't know how to extract this information).
      
      All subsequent packets that are unknown will then be in established
      state since they will fallback to proto_generic and will match the
      'generic' entry.
      
      Our originally proposed version [1] completely disabled generic protocol
      tracking, but Jozsef suggests to not track protocols for which a more
      suitable helper is available, hence we now mitigate the issue for in
      tree known ct protocol helpers only, so that at least NAT and direction
      information will still be preserved for others.
      
       [1] http://www.spinics.net/lists/netfilter-devel/msg33430.html
      
      Joint work with Daniel Borkmann.
      Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarDaniel Borkmann <dborkman@redhat.com>
      Acked-by: default avatarJozsef Kadlecsik <kadlec@blackhole.kfki.hu>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      d7cde286
    • Ben Hutchings's avatar
      splice: Apply generic position and size checks to each write · 894c6350
      Ben Hutchings authored
      We need to check the position and size of file writes against various
      limits, using generic_write_check().  This was not being done for
      the splice write path.  It was fixed upstream by commit 8d020765
      ("->splice_write() via ->write_iter()") but we can't apply that.
      
      CVE-2014-7822
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      894c6350
    • Ben Hutchings's avatar
      vfs: Fix vfsmount_lock imbalance in path_init() · d8c8133e
      Ben Hutchings authored
      When backporting commit 4023bfc9 ("be careful with nd->inode in
      path_init() and follow_dotdot_rcu()"), I failed to account for the
      vfsmount_lock that is used in 3.2 but not upstream.  path_init() takes
      the lock if performing RCU lookup, but must drop it if (and only if)
      it subsequently fails.
      
      Reported-by: nuxi@vault24.org
      References: https://bugzilla.kernel.org/show_bug.cgi?id=92531Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Tested-by: nuxi@vault24.org
      d8c8133e
    • Jay Vosburgh's avatar
      net/core: Handle csum for CHECKSUM_COMPLETE VXLAN forwarding · 5fa7469e
      Jay Vosburgh authored
      [ Upstream commit 2c26d34b ]
      
      When using VXLAN tunnels and a sky2 device, I have experienced
      checksum failures of the following type:
      
      [ 4297.761899] eth0: hw csum failure
      [...]
      [ 4297.765223] Call Trace:
      [ 4297.765224]  <IRQ>  [<ffffffff8172f026>] dump_stack+0x46/0x58
      [ 4297.765235]  [<ffffffff8162ba52>] netdev_rx_csum_fault+0x42/0x50
      [ 4297.765238]  [<ffffffff8161c1a0>] ? skb_push+0x40/0x40
      [ 4297.765240]  [<ffffffff8162325c>] __skb_checksum_complete+0xbc/0xd0
      [ 4297.765243]  [<ffffffff8168c602>] tcp_v4_rcv+0x2e2/0x950
      [ 4297.765246]  [<ffffffff81666ca0>] ? ip_rcv_finish+0x360/0x360
      
      	These are reliably reproduced in a network topology of:
      
      container:eth0 == host(OVS VXLAN on VLAN) == bond0 == eth0 (sky2) -> switch
      
      	When VXLAN encapsulated traffic is received from a similarly
      configured peer, the above warning is generated in the receive
      processing of the encapsulated packet.  Note that the warning is
      associated with the container eth0.
      
              The skbs from sky2 have ip_summed set to CHECKSUM_COMPLETE, and
      because the packet is an encapsulated Ethernet frame, the checksum
      generated by the hardware includes the inner protocol and Ethernet
      headers.
      
      	The receive code is careful to update the skb->csum, except in
      __dev_forward_skb, as called by dev_forward_skb.  __dev_forward_skb
      calls eth_type_trans, which in turn calls skb_pull_inline(skb, ETH_HLEN)
      to skip over the Ethernet header, but does not update skb->csum when
      doing so.
      
      	This patch resolves the problem by adding a call to
      skb_postpull_rcsum to update the skb->csum after the call to
      eth_type_trans.
      Signed-off-by: default avatarJay Vosburgh <jay.vosburgh@canonical.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      5fa7469e
    • Govindarajulu Varadarajan's avatar
      enic: fix rx skb checksum · 58aa2f36
      Govindarajulu Varadarajan authored
      [ Upstream commit 17e96834 ]
      
      Hardware always provides compliment of IP pseudo checksum. Stack expects
      whole packet checksum without pseudo checksum if CHECKSUM_COMPLETE is set.
      
      This causes checksum error in nf & ovs.
      
      kernel: qg-19546f09-f2: hw csum failure
      kernel: CPU: 9 PID: 0 Comm: swapper/9 Tainted: GF          O--------------   3.10.0-123.8.1.el7.x86_64 #1
      kernel: Hardware name: Cisco Systems Inc UCSB-B200-M3/UCSB-B200-M3, BIOS B200M3.2.2.3.0.080820141339 08/08/2014
      kernel: ffff881218f40000 df68243feb35e3a8 ffff881237a43ab8 ffffffff815e237b
      kernel: ffff881237a43ad0 ffffffff814cd4ca ffff8829ec71eb00 ffff881237a43af0
      kernel: ffffffff814c6232 0000000000000286 ffff8829ec71eb00 ffff881237a43b00
      kernel: Call Trace:
      kernel: <IRQ>  [<ffffffff815e237b>] dump_stack+0x19/0x1b
      kernel: [<ffffffff814cd4ca>] netdev_rx_csum_fault+0x3a/0x40
      kernel: [<ffffffff814c6232>] __skb_checksum_complete_head+0x62/0x70
      kernel: [<ffffffff814c6251>] __skb_checksum_complete+0x11/0x20
      kernel: [<ffffffff8155a20c>] nf_ip_checksum+0xcc/0x100
      kernel: [<ffffffffa049edc7>] icmp_error+0x1f7/0x35c [nf_conntrack_ipv4]
      kernel: [<ffffffff814cf419>] ? netif_rx+0xb9/0x1d0
      kernel: [<ffffffffa040eb7b>] ? internal_dev_recv+0xdb/0x130 [openvswitch]
      kernel: [<ffffffffa04c8330>] nf_conntrack_in+0xf0/0xa80 [nf_conntrack]
      kernel: [<ffffffff81509380>] ? inet_del_offload+0x40/0x40
      kernel: [<ffffffffa049e302>] ipv4_conntrack_in+0x22/0x30 [nf_conntrack_ipv4]
      kernel: [<ffffffff815005ca>] nf_iterate+0xaa/0xc0
      kernel: [<ffffffff81509380>] ? inet_del_offload+0x40/0x40
      kernel: [<ffffffff81500664>] nf_hook_slow+0x84/0x140
      kernel: [<ffffffff81509380>] ? inet_del_offload+0x40/0x40
      kernel: [<ffffffff81509dd4>] ip_rcv+0x344/0x380
      
      Hardware verifies IP & tcp/udp header checksum but does not provide payload
      checksum, use CHECKSUM_UNNECESSARY. Set it only if its valid IP tcp/udp packet.
      
      Cc: Jiri Benc <jbenc@redhat.com>
      Cc: Stefan Assmann <sassmann@redhat.com>
      Reported-by: default avatarSunil Choudhary <schoudha@redhat.com>
      Signed-off-by: default avatarGovindarajulu Varadarajan <_govind@gmx.com>
      Reviewed-by: default avatarJiri Benc <jbenc@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      58aa2f36
    • Prashant Sreedharan's avatar
      tg3: tg3_disable_ints using uninitialized mailbox value to disable interrupts · 1da9db7a
      Prashant Sreedharan authored
      [ Upstream commit 05b0aa57 ]
      
      During driver load in tg3_init_one, if the driver detects DMA activity before
      intializing the chip tg3_halt is called. As part of tg3_halt interrupts are
      disabled using routine tg3_disable_ints. This routine was using mailbox value
      which was not initialized (default value is 0). As a result driver was writing
      0x00000001 to pci config space register 0, which is the vendor id / device id.
      
      This driver bug was exposed because of the commit a7877b17a667 (PCI: Check only
      the Vendor ID to identify Configuration Request Retry). Also this issue is only
      seen in older generation chipsets like 5722 because config space write to offset
      0 from driver is possible. The newer generation chips ignore writes to offset 0.
      Also without commit a7877b17a667, for these older chips when a GRC reset is
      issued the Bootcode would reprogram the vendor id/device id, which is the reason
      this bug was masked earlier.
      
      Fixed by initializing the interrupt mailbox registers before calling tg3_halt.
      
      Please queue for -stable.
      Reported-by: default avatarNils Holland <nholland@tisys.org>
      Reported-by: default avatarMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: default avatarPrashant Sreedharan <prashant@broadcom.com>
      Signed-off-by: default avatarMichael Chan <mchan@broadcom.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      1da9db7a
    • Ben Hutchings's avatar
      dcache: Fix locking bugs in backported "deal with deadlock in d_walk()" · 20defcec
      Ben Hutchings authored
      Steven Rostedt reported:
      > Porting -rt to the latest 3.2 stable tree I triggered this bug:
      > 
      > =====================================
      > [ BUG: bad unlock balance detected! ]
      > -------------------------------------
      > rm/1638 is trying to release lock (rcu_read_lock) at:
      > [<c04fde6c>] rcu_read_unlock+0x0/0x23
      > but there are no more locks to release!
      > 
      > other info that might help us debug this:
      > 2 locks held by rm/1638:
      >  #0:  (&sb->s_type->i_mutex_key#9/1){+.+.+.}, at: [<c04f93eb>] do_rmdir+0x5f/0xd2
      >  #1:  (&sb->s_type->i_mutex_key#9){+.+.+.}, at: [<c04f9329>] vfs_rmdir+0x49/0xac
      > 
      > stack backtrace:
      > Pid: 1638, comm: rm Not tainted 3.2.66-test-rt96+ #2
      > Call Trace:
      >  [<c083f390>] ? printk+0x1d/0x1f
      >  [<c0463cdf>] print_unlock_inbalance_bug+0xc3/0xcd
      >  [<c04653a8>] lock_release_non_nested+0x98/0x1ec
      >  [<c046228d>] ? trace_hardirqs_off_caller+0x18/0x90
      >  [<c0456f1c>] ? local_clock+0x2d/0x50
      >  [<c04fde6c>] ? d_hash+0x2f/0x2f
      >  [<c04fde6c>] ? d_hash+0x2f/0x2f
      >  [<c046568e>] lock_release+0x192/0x1ad
      >  [<c04fde83>] rcu_read_unlock+0x17/0x23
      >  [<c04ff344>] shrink_dcache_parent+0x227/0x270
      >  [<c04f9348>] vfs_rmdir+0x68/0xac
      >  [<c04f9424>] do_rmdir+0x98/0xd2
      >  [<c04f03ad>] ? fput+0x1a3/0x1ab
      >  [<c084dd42>] ? sysenter_exit+0xf/0x1a
      >  [<c0465b58>] ? trace_hardirqs_on_caller+0x118/0x149
      >  [<c04fa3e0>] sys_unlinkat+0x2b/0x35
      >  [<c084dd13>] sysenter_do_call+0x12/0x12
      > 
      > 
      > 
      > 
      > There's a path to calling rcu_read_unlock() without calling
      > rcu_read_lock() in have_submounts().
      > 
      > 	goto positive;
      > 
      > positive:
      > 	if (!locked && read_seqretry(&rename_lock, seq))
      > 		goto rename_retry;
      > 
      > rename_retry:
      > 	rcu_read_unlock();
      > 
      > in the above path, rcu_read_lock() is never done before calling
      > rcu_read_unlock();
      
      I reviewed locking contexts in all three functions that I changed when
      backporting "deal with deadlock in d_walk()".  It's actually worse
      than this:
      
      - We don't hold this_parent->d_lock at the 'positive' label in
        have_submounts(), but it is unlocked after 'rename_retry'.
      - There is an rcu_read_unlock() after the 'out' label in
        select_parent(), but it's not held at the 'goto out'.
      
      Fix all three lock imbalances.
      Reported-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Tested-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      20defcec
    • Dan Carpenter's avatar
      netfilter: ipset: small potential read beyond the end of buffer · d64fba0d
      Dan Carpenter authored
      commit 2196937e upstream.
      
      We could be reading 8 bytes into a 4 byte buffer here.  It seems
      harmless but adding a check is the right thing to do and it silences a
      static checker warning.
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Acked-by: default avatarJozsef Kadlecsik <kadlec@blackhole.kfki.hu>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      d64fba0d
    • Sasha Levin's avatar
      KEYS: close race between key lookup and freeing · dc4a2f40
      Sasha Levin authored
      commit a3a87844 upstream.
      
      When a key is being garbage collected, it's key->user would get put before
      the ->destroy() callback is called, where the key is removed from it's
      respective tracking structures.
      
      This leaves a key hanging in a semi-invalid state which leaves a window open
      for a different task to try an access key->user. An example is
      find_keyring_by_name() which would dereference key->user for a key that is
      in the process of being garbage collected (where key->user was freed but
      ->destroy() wasn't called yet - so it's still present in the linked list).
      
      This would cause either a panic, or corrupt memory.
      
      Fixes CVE-2014-9529.
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      [bwh: Backported to 3.2: adjust indentation]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      dc4a2f40
    • Jerry Hoemann's avatar
      fsnotify: next_i is freed during fsnotify_unmount_inodes. · 38edb97e
      Jerry Hoemann authored
      commit 6424babf upstream.
      
      During file system stress testing on 3.10 and 3.12 based kernels, the
      umount command occasionally hung in fsnotify_unmount_inodes in the
      section of code:
      
                      spin_lock(&inode->i_lock);
                      if (inode->i_state & (I_FREEING|I_WILL_FREE|I_NEW)) {
                              spin_unlock(&inode->i_lock);
                              continue;
                      }
      
      As this section of code holds the global inode_sb_list_lock, eventually
      the system hangs trying to acquire the lock.
      
      Multiple crash dumps showed:
      
      The inode->i_state == 0x60 and i_count == 0 and i_sb_list would point
      back at itself.  As this is not the value of list upon entry to the
      function, the kernel never exits the loop.
      
      To help narrow down problem, the call to list_del_init in
      inode_sb_list_del was changed to list_del.  This poisons the pointers in
      the i_sb_list and causes a kernel to panic if it transverse a freed
      inode.
      
      Subsequent stress testing paniced in fsnotify_unmount_inodes at the
      bottom of the list_for_each_entry_safe loop showing next_i had become
      free.
      
      We believe the root cause of the problem is that next_i is being freed
      during the window of time that the list_for_each_entry_safe loop
      temporarily releases inode_sb_list_lock to call fsnotify and
      fsnotify_inode_delete.
      
      The code in fsnotify_unmount_inodes attempts to prevent the freeing of
      inode and next_i by calling __iget.  However, the code doesn't do the
      __iget call on next_i
      
      	if i_count == 0 or
      	if i_state & (I_FREEING | I_WILL_FREE)
      
      The patch addresses this issue by advancing next_i in the above two cases
      until we either find a next_i which we can __iget or we reach the end of
      the list.  This makes the handling of next_i more closely match the
      handling of the variable "inode."
      
      The time to reproduce the hang is highly variable (from hours to days.) We
      ran the stress test on a 3.10 kernel with the proposed patch for a week
      without failure.
      
      During list_for_each_entry_safe, next_i is becoming free causing
      the loop to never terminate.  Advance next_i in those cases where
      __iget is not done.
      Signed-off-by: default avatarJerry Hoemann <jerry.hoemann@hp.com>
      Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      Cc: Ken Helias <kenhelias@firemail.de>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Jan Kara <jack@suse.cz>
      38edb97e
    • Borislav Petkov's avatar
      x86, cpu, amd: Add workaround for family 16h, erratum 793 · 9ec2b315
      Borislav Petkov authored
      commit 3b564968 upstream.
      
      This adds the workaround for erratum 793 as a precaution in case not
      every BIOS implements it.  This addresses CVE-2013-6885.
      
      Erratum text:
      
      [Revision Guide for AMD Family 16h Models 00h-0Fh Processors,
      document 51810 Rev. 3.04 November 2013]
      
      793 Specific Combination of Writes to Write Combined Memory Types and
      Locked Instructions May Cause Core Hang
      
      Description
      
      Under a highly specific and detailed set of internal timing
      conditions, a locked instruction may trigger a timing sequence whereby
      the write to a write combined memory type is not flushed, causing the
      locked instruction to stall indefinitely.
      
      Potential Effect on System
      
      Processor core hang.
      
      Suggested Workaround
      
      BIOS should set MSR
      C001_1020[15] = 1b.
      
      Fix Planned
      
      No fix planned
      
      [ hpa: updated description, fixed typo in MSR name ]
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Link: http://lkml.kernel.org/r/20140114230711.GS29865@pd.tnicTested-by: default avatarAravind Gopalakrishnan <aravind.gopalakrishnan@amd.com>
      Signed-off-by: default avatarH. Peter Anvin <hpa@linux.intel.com>
      [bwh: Backported to 3.2:
       - Adjust filename
       - Venkatesh Srinivas pointed out we should use {rd,wr}msrl_safe() to
         avoid crashing on KVM.  This was fixed upstream by commit 8f86a737
         ("x86, AMD: Convert to the new bit access MSR accessors") but that's too
         much trouble to backport.  Here we must use {rd,wr}msrl_amd_safe().]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Moritz Muehlenhoff <jmm@debian.org>
      Cc: Venkatesh Srinivas <venkateshs@google.com>
      9ec2b315
    • Martin Schwidefsky's avatar
      s390/3215: fix tty output containing tabs · 308246b8
      Martin Schwidefsky authored
      commit e512d56c upstream.
      
      git commit 37f81fa1
      "n_tty: do O_ONLCR translation as a single write"
      surfaced a bug in the 3215 device driver. In combination this
      broke tab expansion for tty ouput.
      
      The cause is an asymmetry in the behaviour of tty3215_ops->write
      vs tty3215_ops->put_char. The put_char function scans for '\t'
      but the write function does not.
      
      As the driver has logic for the '\t' expansion remove XTABS
      from c_oflag of the initial termios as well.
      Reported-by: default avatarStephen Powell <zlinuxman@wowway.com>
      Signed-off-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      308246b8
    • Lv Zheng's avatar
      ACPI / EC: Fix regression due to conflicting firmware behavior between Samsung and Acer. · bb5850b1
      Lv Zheng authored
      commit 79149001 upstream.
      
      It is reported that Samsung laptops that need to poll events are broken by
      the following commit:
       Commit 3afcf2ec
       Subject: ACPI / EC: Add support to disallow QR_EC to be issued when SCI_EVT isn't set
      
      The behaviors of the 2 vendor firmwares are conflict:
       1. Acer: OSPM shouldn't issue QR_EC unless SCI_EVT is set, firmware
               automatically sets SCI_EVT as long as there is event queued up.
       2. Samsung: OSPM should issue QR_EC whatever SCI_EVT is set, firmware
                  returns 0 when there is no event queued up.
      
      This patch is a quick fix to distinguish the behaviors to make Acer
      behavior only effective for Acer EC firmware so that the breakages on
      Samsung EC firmware can be avoided.
      
      Fixes: 3afcf2ec (ACPI / EC: Add support to disallow QR_EC to be issued ...)
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=44161Reported-and-tested-by: default avatarOrtwin Glück <odi@odi.ch>
      Signed-off-by: default avatarLv Zheng <lv.zheng@intel.com>
      [ rjw : Subject ]
      Signed-off-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Kamal Mostafa <kamal@canonical.com>
      bb5850b1
    • Ben Hutchings's avatar
      Revert "x86, 64bit, mm: Mark data/bss/brk to nx" · 309adf3c
      Ben Hutchings authored
      This reverts commit e105c818 which
      was commit 72212675 upstream.
      
      This caused suspend/resume to stop working on at least some systems -
      specifically, the system would reboot when woken.
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      309adf3c
    • Ben Hutchings's avatar
      Revert "x86, mm: Set NX across entire PMD at boot" · dc705052
      Ben Hutchings authored
      This reverts commit a5c187d9 which
      was commit 45e2a9d4 upstream.
      
      The previous commit caused suspend/resume to stop working on at least
      some systems - specifically, the system would reboot when woken.
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      dc705052
    • Linus Torvalds's avatar
      vm: make stack guard page errors return VM_FAULT_SIGSEGV rather than SIGBUS · a8467a7a
      Linus Torvalds authored
      commit 9c145c56 upstream.
      
      The stack guard page error case has long incorrectly caused a SIGBUS
      rather than a SIGSEGV, but nobody actually noticed until commit
      fee7e49d ("mm: propagate error from stack expansion even for guard
      page") because that error case was never actually triggered in any
      normal situations.
      
      Now that we actually report the error, people noticed the wrong signal
      that resulted.  So far, only the test suite of libsigsegv seems to have
      actually cared, but there are real applications that use libsigsegv, so
      let's not wait for any of those to break.
      Reported-and-tested-by: default avatarTakashi Iwai <tiwai@suse.de>
      Tested-by: default avatarJan Engelhardt <jengelh@inai.de>
      Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> # "s390 still compiles and boots"
      Cc: linux-arch@vger.kernel.org
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      a8467a7a
    • Linus Torvalds's avatar
      vm: add VM_FAULT_SIGSEGV handling support · 219a047e
      Linus Torvalds authored
      commit 33692f27 upstream.
      
      The core VM already knows about VM_FAULT_SIGBUS, but cannot return a
      "you should SIGSEGV" error, because the SIGSEGV case was generally
      handled by the caller - usually the architecture fault handler.
      
      That results in lots of duplication - all the architecture fault
      handlers end up doing very similar "look up vma, check permissions, do
      retries etc" - but it generally works.  However, there are cases where
      the VM actually wants to SIGSEGV, and applications _expect_ SIGSEGV.
      
      In particular, when accessing the stack guard page, libsigsegv expects a
      SIGSEGV.  And it usually got one, because the stack growth is handled by
      that duplicated architecture fault handler.
      
      However, when the generic VM layer started propagating the error return
      from the stack expansion in commit fee7e49d ("mm: propagate error
      from stack expansion even for guard page"), that now exposed the
      existing VM_FAULT_SIGBUS result to user space.  And user space really
      expected SIGSEGV, not SIGBUS.
      
      To fix that case, we need to add a VM_FAULT_SIGSEGV, and teach all those
      duplicate architecture fault handlers about it.  They all already have
      the code to handle SIGSEGV, so it's about just tying that new return
      value to the existing code, but it's all a bit annoying.
      
      This is the mindless minimal patch to do this.  A more extensive patch
      would be to try to gather up the mostly shared fault handling logic into
      one generic helper routine, and long-term we really should do that
      cleanup.
      
      Just from this patch, you can generally see that most architectures just
      copied (directly or indirectly) the old x86 way of doing things, but in
      the meantime that original x86 model has been improved to hold the VM
      semaphore for shorter times etc and to handle VM_FAULT_RETRY and other
      "newer" things, so it would be a good idea to bring all those
      improvements to the generic case and teach other architectures about
      them too.
      Reported-and-tested-by: default avatarTakashi Iwai <tiwai@suse.de>
      Tested-by: default avatarJan Engelhardt <jengelh@inai.de>
      Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> # "s390 still compiles and boots"
      Cc: linux-arch@vger.kernel.org
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      [bwh: Backported to 3.2:
       - Adjust filenames, context
       - Drop arc, metag, nios2 and lustre changes
       - For sh, patch both 32-bit and 64-bit implementations to use goto bad_area
       - For s390, pass int_code and trans_exc_code as arguments to do_no_context()
         and do_sigsegv()]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      219a047e