1. 09 May, 2015 5 commits
  2. 06 Mar, 2015 24 commits
  3. 20 Feb, 2015 11 commits
    • Ben Hutchings's avatar
      Linux 3.2.67 · fd623507
      Ben Hutchings authored
      fd623507
    • Myron Stowe's avatar
      PCI: Handle read-only BARs on AMD CS553x devices · a758de0e
      Myron Stowe authored
      commit 06cf35f9 upstream.
      
      Some AMD CS553x devices have read-only BARs because of a firmware or
      hardware defect.  There's a workaround in quirk_cs5536_vsa(), but it no
      longer works after 36e81648 ("PCI: Restore detection of read-only
      BARs").  Prior to 36e81648, we filled in res->start; afterwards we
      leave it zeroed out.  The quirk only updated the size, so the driver tried
      to use a region starting at zero, which didn't work.
      
      Expand quirk_cs5536_vsa() to read the base addresses from the BARs and
      hard-code the sizes.
      
      On Nix's system BAR 2's read-only value is 0x6200.  Prior to 36e81648,
      we interpret that as a 512-byte BAR based on the lowest-order bit set.  Per
      datasheet sec 5.6.1, that BAR (MFGPT) requires only 64 bytes; use that to
      avoid clearing any address bits if a platform uses only 64-byte alignment.
      
      [bhelgaas: changelog, reduce BAR 2 size to 64]
      Fixes: 36e81648 ("PCI: Restore detection of read-only BARs")
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=85991#c4
      Link: http://support.amd.com/TechDocs/31506_cs5535_databook.pdf
      Link: http://support.amd.com/TechDocs/33238G_cs5536_db.pdfReported-and-tested-by: default avatarNix <nix@esperi.org.uk>
      Signed-off-by: default avatarMyron Stowe <myron.stowe@redhat.com>
      Signed-off-by: default avatarBjorn Helgaas <bhelgaas@google.com>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      a758de0e
    • Nadav Amit's avatar
      KVM: x86: SYSENTER emulation is broken · 038911f3
      Nadav Amit authored
      commit f3747379 upstream.
      
      SYSENTER emulation is broken in several ways:
      1. It misses the case of 16-bit code segments completely (CVE-2015-0239).
      2. MSR_IA32_SYSENTER_CS is checked in 64-bit mode incorrectly (bits 0 and 1 can
         still be set without causing #GP).
      3. MSR_IA32_SYSENTER_EIP and MSR_IA32_SYSENTER_ESP are not masked in
         legacy-mode.
      4. There is some unneeded code.
      
      Fix it.
      
      Cc: stable@vger.linux.org
      Signed-off-by: default avatarNadav Amit <namit@cs.technion.ac.il>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      038911f3
    • Avi Kivity's avatar
      KVM: x86 emulator: reject SYSENTER in compatibility mode on AMD guests · d5616c08
      Avi Kivity authored
      commit 1a18a69b upstream.
      
      If the guest thinks it's an AMD, it will not have prepared the SYSENTER MSRs,
      and if the guest executes SYSENTER in compatibility mode, it will fails.
      
      Detect this condition and #UD instead, like the spec says.
      Signed-off-by: default avatarAvi Kivity <avi@redhat.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      d5616c08
    • Florian Westphal's avatar
      netfilter: conntrack: disable generic tracking for known protocols · d7cde286
      Florian Westphal authored
      commit db29a950 upstream.
      
      Given following iptables ruleset:
      
      -P FORWARD DROP
      -A FORWARD -m sctp --dport 9 -j ACCEPT
      -A FORWARD -p tcp --dport 80 -j ACCEPT
      -A FORWARD -p tcp -m conntrack -m state ESTABLISHED,RELATED -j ACCEPT
      
      One would assume that this allows SCTP on port 9 and TCP on port 80.
      Unfortunately, if the SCTP conntrack module is not loaded, this allows
      *all* SCTP communication, to pass though, i.e. -p sctp -j ACCEPT,
      which we think is a security issue.
      
      This is because on the first SCTP packet on port 9, we create a dummy
      "generic l4" conntrack entry without any port information (since
      conntrack doesn't know how to extract this information).
      
      All subsequent packets that are unknown will then be in established
      state since they will fallback to proto_generic and will match the
      'generic' entry.
      
      Our originally proposed version [1] completely disabled generic protocol
      tracking, but Jozsef suggests to not track protocols for which a more
      suitable helper is available, hence we now mitigate the issue for in
      tree known ct protocol helpers only, so that at least NAT and direction
      information will still be preserved for others.
      
       [1] http://www.spinics.net/lists/netfilter-devel/msg33430.html
      
      Joint work with Daniel Borkmann.
      Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarDaniel Borkmann <dborkman@redhat.com>
      Acked-by: default avatarJozsef Kadlecsik <kadlec@blackhole.kfki.hu>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      d7cde286
    • Ben Hutchings's avatar
      splice: Apply generic position and size checks to each write · 894c6350
      Ben Hutchings authored
      We need to check the position and size of file writes against various
      limits, using generic_write_check().  This was not being done for
      the splice write path.  It was fixed upstream by commit 8d020765
      ("->splice_write() via ->write_iter()") but we can't apply that.
      
      CVE-2014-7822
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      894c6350
    • Ben Hutchings's avatar
      vfs: Fix vfsmount_lock imbalance in path_init() · d8c8133e
      Ben Hutchings authored
      When backporting commit 4023bfc9 ("be careful with nd->inode in
      path_init() and follow_dotdot_rcu()"), I failed to account for the
      vfsmount_lock that is used in 3.2 but not upstream.  path_init() takes
      the lock if performing RCU lookup, but must drop it if (and only if)
      it subsequently fails.
      
      Reported-by: nuxi@vault24.org
      References: https://bugzilla.kernel.org/show_bug.cgi?id=92531Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Tested-by: nuxi@vault24.org
      d8c8133e
    • Jay Vosburgh's avatar
      net/core: Handle csum for CHECKSUM_COMPLETE VXLAN forwarding · 5fa7469e
      Jay Vosburgh authored
      [ Upstream commit 2c26d34b ]
      
      When using VXLAN tunnels and a sky2 device, I have experienced
      checksum failures of the following type:
      
      [ 4297.761899] eth0: hw csum failure
      [...]
      [ 4297.765223] Call Trace:
      [ 4297.765224]  <IRQ>  [<ffffffff8172f026>] dump_stack+0x46/0x58
      [ 4297.765235]  [<ffffffff8162ba52>] netdev_rx_csum_fault+0x42/0x50
      [ 4297.765238]  [<ffffffff8161c1a0>] ? skb_push+0x40/0x40
      [ 4297.765240]  [<ffffffff8162325c>] __skb_checksum_complete+0xbc/0xd0
      [ 4297.765243]  [<ffffffff8168c602>] tcp_v4_rcv+0x2e2/0x950
      [ 4297.765246]  [<ffffffff81666ca0>] ? ip_rcv_finish+0x360/0x360
      
      	These are reliably reproduced in a network topology of:
      
      container:eth0 == host(OVS VXLAN on VLAN) == bond0 == eth0 (sky2) -> switch
      
      	When VXLAN encapsulated traffic is received from a similarly
      configured peer, the above warning is generated in the receive
      processing of the encapsulated packet.  Note that the warning is
      associated with the container eth0.
      
              The skbs from sky2 have ip_summed set to CHECKSUM_COMPLETE, and
      because the packet is an encapsulated Ethernet frame, the checksum
      generated by the hardware includes the inner protocol and Ethernet
      headers.
      
      	The receive code is careful to update the skb->csum, except in
      __dev_forward_skb, as called by dev_forward_skb.  __dev_forward_skb
      calls eth_type_trans, which in turn calls skb_pull_inline(skb, ETH_HLEN)
      to skip over the Ethernet header, but does not update skb->csum when
      doing so.
      
      	This patch resolves the problem by adding a call to
      skb_postpull_rcsum to update the skb->csum after the call to
      eth_type_trans.
      Signed-off-by: default avatarJay Vosburgh <jay.vosburgh@canonical.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      5fa7469e
    • Govindarajulu Varadarajan's avatar
      enic: fix rx skb checksum · 58aa2f36
      Govindarajulu Varadarajan authored
      [ Upstream commit 17e96834 ]
      
      Hardware always provides compliment of IP pseudo checksum. Stack expects
      whole packet checksum without pseudo checksum if CHECKSUM_COMPLETE is set.
      
      This causes checksum error in nf & ovs.
      
      kernel: qg-19546f09-f2: hw csum failure
      kernel: CPU: 9 PID: 0 Comm: swapper/9 Tainted: GF          O--------------   3.10.0-123.8.1.el7.x86_64 #1
      kernel: Hardware name: Cisco Systems Inc UCSB-B200-M3/UCSB-B200-M3, BIOS B200M3.2.2.3.0.080820141339 08/08/2014
      kernel: ffff881218f40000 df68243feb35e3a8 ffff881237a43ab8 ffffffff815e237b
      kernel: ffff881237a43ad0 ffffffff814cd4ca ffff8829ec71eb00 ffff881237a43af0
      kernel: ffffffff814c6232 0000000000000286 ffff8829ec71eb00 ffff881237a43b00
      kernel: Call Trace:
      kernel: <IRQ>  [<ffffffff815e237b>] dump_stack+0x19/0x1b
      kernel: [<ffffffff814cd4ca>] netdev_rx_csum_fault+0x3a/0x40
      kernel: [<ffffffff814c6232>] __skb_checksum_complete_head+0x62/0x70
      kernel: [<ffffffff814c6251>] __skb_checksum_complete+0x11/0x20
      kernel: [<ffffffff8155a20c>] nf_ip_checksum+0xcc/0x100
      kernel: [<ffffffffa049edc7>] icmp_error+0x1f7/0x35c [nf_conntrack_ipv4]
      kernel: [<ffffffff814cf419>] ? netif_rx+0xb9/0x1d0
      kernel: [<ffffffffa040eb7b>] ? internal_dev_recv+0xdb/0x130 [openvswitch]
      kernel: [<ffffffffa04c8330>] nf_conntrack_in+0xf0/0xa80 [nf_conntrack]
      kernel: [<ffffffff81509380>] ? inet_del_offload+0x40/0x40
      kernel: [<ffffffffa049e302>] ipv4_conntrack_in+0x22/0x30 [nf_conntrack_ipv4]
      kernel: [<ffffffff815005ca>] nf_iterate+0xaa/0xc0
      kernel: [<ffffffff81509380>] ? inet_del_offload+0x40/0x40
      kernel: [<ffffffff81500664>] nf_hook_slow+0x84/0x140
      kernel: [<ffffffff81509380>] ? inet_del_offload+0x40/0x40
      kernel: [<ffffffff81509dd4>] ip_rcv+0x344/0x380
      
      Hardware verifies IP & tcp/udp header checksum but does not provide payload
      checksum, use CHECKSUM_UNNECESSARY. Set it only if its valid IP tcp/udp packet.
      
      Cc: Jiri Benc <jbenc@redhat.com>
      Cc: Stefan Assmann <sassmann@redhat.com>
      Reported-by: default avatarSunil Choudhary <schoudha@redhat.com>
      Signed-off-by: default avatarGovindarajulu Varadarajan <_govind@gmx.com>
      Reviewed-by: default avatarJiri Benc <jbenc@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      58aa2f36
    • Prashant Sreedharan's avatar
      tg3: tg3_disable_ints using uninitialized mailbox value to disable interrupts · 1da9db7a
      Prashant Sreedharan authored
      [ Upstream commit 05b0aa57 ]
      
      During driver load in tg3_init_one, if the driver detects DMA activity before
      intializing the chip tg3_halt is called. As part of tg3_halt interrupts are
      disabled using routine tg3_disable_ints. This routine was using mailbox value
      which was not initialized (default value is 0). As a result driver was writing
      0x00000001 to pci config space register 0, which is the vendor id / device id.
      
      This driver bug was exposed because of the commit a7877b17a667 (PCI: Check only
      the Vendor ID to identify Configuration Request Retry). Also this issue is only
      seen in older generation chipsets like 5722 because config space write to offset
      0 from driver is possible. The newer generation chips ignore writes to offset 0.
      Also without commit a7877b17a667, for these older chips when a GRC reset is
      issued the Bootcode would reprogram the vendor id/device id, which is the reason
      this bug was masked earlier.
      
      Fixed by initializing the interrupt mailbox registers before calling tg3_halt.
      
      Please queue for -stable.
      Reported-by: default avatarNils Holland <nholland@tisys.org>
      Reported-by: default avatarMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: default avatarPrashant Sreedharan <prashant@broadcom.com>
      Signed-off-by: default avatarMichael Chan <mchan@broadcom.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      1da9db7a
    • Ben Hutchings's avatar
      dcache: Fix locking bugs in backported "deal with deadlock in d_walk()" · 20defcec
      Ben Hutchings authored
      Steven Rostedt reported:
      > Porting -rt to the latest 3.2 stable tree I triggered this bug:
      > 
      > =====================================
      > [ BUG: bad unlock balance detected! ]
      > -------------------------------------
      > rm/1638 is trying to release lock (rcu_read_lock) at:
      > [<c04fde6c>] rcu_read_unlock+0x0/0x23
      > but there are no more locks to release!
      > 
      > other info that might help us debug this:
      > 2 locks held by rm/1638:
      >  #0:  (&sb->s_type->i_mutex_key#9/1){+.+.+.}, at: [<c04f93eb>] do_rmdir+0x5f/0xd2
      >  #1:  (&sb->s_type->i_mutex_key#9){+.+.+.}, at: [<c04f9329>] vfs_rmdir+0x49/0xac
      > 
      > stack backtrace:
      > Pid: 1638, comm: rm Not tainted 3.2.66-test-rt96+ #2
      > Call Trace:
      >  [<c083f390>] ? printk+0x1d/0x1f
      >  [<c0463cdf>] print_unlock_inbalance_bug+0xc3/0xcd
      >  [<c04653a8>] lock_release_non_nested+0x98/0x1ec
      >  [<c046228d>] ? trace_hardirqs_off_caller+0x18/0x90
      >  [<c0456f1c>] ? local_clock+0x2d/0x50
      >  [<c04fde6c>] ? d_hash+0x2f/0x2f
      >  [<c04fde6c>] ? d_hash+0x2f/0x2f
      >  [<c046568e>] lock_release+0x192/0x1ad
      >  [<c04fde83>] rcu_read_unlock+0x17/0x23
      >  [<c04ff344>] shrink_dcache_parent+0x227/0x270
      >  [<c04f9348>] vfs_rmdir+0x68/0xac
      >  [<c04f9424>] do_rmdir+0x98/0xd2
      >  [<c04f03ad>] ? fput+0x1a3/0x1ab
      >  [<c084dd42>] ? sysenter_exit+0xf/0x1a
      >  [<c0465b58>] ? trace_hardirqs_on_caller+0x118/0x149
      >  [<c04fa3e0>] sys_unlinkat+0x2b/0x35
      >  [<c084dd13>] sysenter_do_call+0x12/0x12
      > 
      > 
      > 
      > 
      > There's a path to calling rcu_read_unlock() without calling
      > rcu_read_lock() in have_submounts().
      > 
      > 	goto positive;
      > 
      > positive:
      > 	if (!locked && read_seqretry(&rename_lock, seq))
      > 		goto rename_retry;
      > 
      > rename_retry:
      > 	rcu_read_unlock();
      > 
      > in the above path, rcu_read_lock() is never done before calling
      > rcu_read_unlock();
      
      I reviewed locking contexts in all three functions that I changed when
      backporting "deal with deadlock in d_walk()".  It's actually worse
      than this:
      
      - We don't hold this_parent->d_lock at the 'positive' label in
        have_submounts(), but it is unlocked after 'rename_retry'.
      - There is an rcu_read_unlock() after the 'out' label in
        select_parent(), but it's not held at the 'goto out'.
      
      Fix all three lock imbalances.
      Reported-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Tested-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      20defcec