1. 21 Jan, 2015 7 commits
  2. 14 Jan, 2015 5 commits
  3. 13 Jan, 2015 13 commits
    • Andy Lutomirski's avatar
      x86_64, vdso: Fix the vdso address randomization algorithm · ff845a15
      Andy Lutomirski authored
      commit 394f56fe upstream.
      
      The theory behind vdso randomization is that it's mapped at a random
      offset above the top of the stack.  To avoid wasting a page of
      memory for an extra page table, the vdso isn't supposed to extend
      past the lowest PMD into which it can fit.  Other than that, the
      address should be a uniformly distributed address that meets all of
      the alignment requirements.
      
      The current algorithm is buggy: the vdso has about a 50% probability
      of being at the very end of a PMD.  The current algorithm also has a
      decent chance of failing outright due to incorrect handling of the
      case where the top of the stack is near the top of its PMD.
      
      This fixes the implementation.  The paxtest estimate of vdso
      "randomisation" improves from 11 bits to 18 bits.  (Disclaimer: I
      don't know what the paxtest code is actually calculating.)
      
      It's worth noting that this algorithm is inherently biased: the vdso
      is more likely to end up near the end of its PMD than near the
      beginning.  Ideally we would either nix the PMD sharing requirement
      or jointly randomize the vdso and the stack to reduce the bias.
      
      In the mean time, this is a considerable improvement with basically
      no risk of compatibility issues, since the allowed outputs of the
      algorithm are unchanged.
      
      As an easy test, doing this:
      
      for i in `seq 10000`
        do grep -P vdso /proc/self/maps |cut -d- -f1
      done |sort |uniq -d
      
      used to produce lots of output (1445 lines on my most recent run).
      A tiny subset looks like this:
      
      7fffdfffe000
      7fffe01fe000
      7fffe05fe000
      7fffe07fe000
      7fffe09fe000
      7fffe0bfe000
      7fffe0dfe000
      
      Note the suspicious fe000 endings.  With the fix, I get a much more
      palatable 76 repeated addresses.
      Reviewed-by: default avatarKees Cook <keescook@chromium.org>
      Signed-off-by: default avatarAndy Lutomirski <luto@amacapital.net>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      ff845a15
    • Jan Kara's avatar
      isofs: Fix unchecked printing of ER records · 24c7fcc3
      Jan Kara authored
      commit 4e202462 upstream.
      
      We didn't check length of rock ridge ER records before printing them.
      Thus corrupted isofs image can cause us to access and print some memory
      behind the buffer with obvious consequences.
      Reported-and-tested-by: default avatarCarl Henrik Lunde <chlunde@ping.uio.no>
      Signed-off-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      24c7fcc3
    • Sasha Levin's avatar
      KEYS: close race between key lookup and freeing · 55036ae4
      Sasha Levin authored
      commit a3a87844 upstream.
      
      When a key is being garbage collected, it's key->user would get put before
      the ->destroy() callback is called, where the key is removed from it's
      respective tracking structures.
      
      This leaves a key hanging in a semi-invalid state which leaves a window open
      for a different task to try an access key->user. An example is
      find_keyring_by_name() which would dereference key->user for a key that is
      in the process of being garbage collected (where key->user was freed but
      ->destroy() wasn't called yet - so it's still present in the linked list).
      
      This would cause either a panic, or corrupt memory.
      
      Fixes CVE-2014-9529.
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      55036ae4
    • Sven Eckelmann's avatar
      batman-adv: Calculate extra tail size based on queued fragments · e6e75eaa
      Sven Eckelmann authored
      commit 5b6698b0 upstream.
      
      The fragmentation code was replaced in 610bfc6b
      ("batman-adv: Receive fragmented packets and merge"). The new code provided a
      mostly unused parameter skb for the merging function. It is used inside the
      function to calculate the additionally needed skb tailroom. But instead of
      increasing its own tailroom, it is only increasing the tailroom of the first
      queued skb. This is not correct in some situations because the first queued
      entry can be a different one than the parameter.
      
      An observed problem was:
      
      1. packet with size 104, total_size 1464, fragno 1 was received
         - packet is queued
      2. packet with size 1400, total_size 1464, fragno 0 was received
         - packet is queued at the end of the list
      3. enough data was received and can be given to the merge function
         (1464 == (1400 - 20) + (104 - 20))
         - merge functions gets 1400 byte large packet as skb argument
      4. merge function gets first entry in queue (104 byte)
         - stored as skb_out
      5. merge function calculates the required extra tail as total_size - skb->len
         - pskb_expand_head tail of skb_out with 64 bytes
      6. merge function tries to squeeze the extra 1380 bytes from the second queued
         skb (1400 byte aka skb parameter) in the 64 extra tail bytes of skb_out
      
      Instead calculate the extra required tail bytes for skb_out also using skb_out
      instead of using the parameter skb. The skb parameter is only used to get the
      total_size from the last received packet. This is also the total_size used to
      decide that all fragments were received.
      Reported-by: default avatarPhilipp Psurek <philipp.psurek@gmail.com>
      Signed-off-by: default avatarSven Eckelmann <sven@narfation.org>
      Acked-by: default avatarMartin Hundebøll <martin@hundeboll.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      e6e75eaa
    • Jan Kara's avatar
      isofs: Fix infinite looping over CE entries · f5034d91
      Jan Kara authored
      commit f54e18f1 upstream.
      
      Rock Ridge extensions define so called Continuation Entries (CE) which
      define where is further space with Rock Ridge data. Corrupted isofs
      image can contain arbitrarily long chain of these, including a one
      containing loop and thus causing kernel to end in an infinite loop when
      traversing these entries.
      
      Limit the traversal to 32 entries which should be more than enough space
      to store all the Rock Ridge data.
      Reported-by: default avatarP J P <ppandit@redhat.com>
      Signed-off-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      f5034d91
    • Andy Lutomirski's avatar
      x86_64, switch_to(): Load TLS descriptors before switching DS and ES · 39f1a2d2
      Andy Lutomirski authored
      commit f647d7c1 upstream.
      
      Otherwise, if buggy user code points DS or ES into the TLS
      array, they would be corrupted after a context switch.
      
      This also significantly improves the comments and documents some
      gotchas in the code.
      
      Before this patch, the both tests below failed.  With this
      patch, the es test passes, although the gsbase test still fails.
      
       ----- begin es test -----
      
      /*
       * Copyright (c) 2014 Andy Lutomirski
       * GPL v2
       */
      
      static unsigned short GDT3(int idx)
      {
      	return (idx << 3) | 3;
      }
      
      static int create_tls(int idx, unsigned int base)
      {
      	struct user_desc desc = {
      		.entry_number    = idx,
      		.base_addr       = base,
      		.limit           = 0xfffff,
      		.seg_32bit       = 1,
      		.contents        = 0, /* Data, grow-up */
      		.read_exec_only  = 0,
      		.limit_in_pages  = 1,
      		.seg_not_present = 0,
      		.useable         = 0,
      	};
      
      	if (syscall(SYS_set_thread_area, &desc) != 0)
      		err(1, "set_thread_area");
      
      	return desc.entry_number;
      }
      
      int main()
      {
      	int idx = create_tls(-1, 0);
      	printf("Allocated GDT index %d\n", idx);
      
      	unsigned short orig_es;
      	asm volatile ("mov %%es,%0" : "=rm" (orig_es));
      
      	int errors = 0;
      	int total = 1000;
      	for (int i = 0; i < total; i++) {
      		asm volatile ("mov %0,%%es" : : "rm" (GDT3(idx)));
      		usleep(100);
      
      		unsigned short es;
      		asm volatile ("mov %%es,%0" : "=rm" (es));
      		asm volatile ("mov %0,%%es" : : "rm" (orig_es));
      		if (es != GDT3(idx)) {
      			if (errors == 0)
      				printf("[FAIL]\tES changed from 0x%hx to 0x%hx\n",
      				       GDT3(idx), es);
      			errors++;
      		}
      	}
      
      	if (errors) {
      		printf("[FAIL]\tES was corrupted %d/%d times\n", errors, total);
      		return 1;
      	} else {
      		printf("[OK]\tES was preserved\n");
      		return 0;
      	}
      }
      
       ----- end es test -----
      
       ----- begin gsbase test -----
      
      /*
       * gsbase.c, a gsbase test
       * Copyright (c) 2014 Andy Lutomirski
       * GPL v2
       */
      
      static unsigned char *testptr, *testptr2;
      
      static unsigned char read_gs_testvals(void)
      {
      	unsigned char ret;
      	asm volatile ("movb %%gs:%1, %0" : "=r" (ret) : "m" (*testptr));
      	return ret;
      }
      
      int main()
      {
      	int errors = 0;
      
      	testptr = mmap((void *)0x200000000UL, 1, PROT_READ | PROT_WRITE,
      		       MAP_PRIVATE | MAP_FIXED | MAP_ANONYMOUS, -1, 0);
      	if (testptr == MAP_FAILED)
      		err(1, "mmap");
      
      	testptr2 = mmap((void *)0x300000000UL, 1, PROT_READ | PROT_WRITE,
      		       MAP_PRIVATE | MAP_FIXED | MAP_ANONYMOUS, -1, 0);
      	if (testptr2 == MAP_FAILED)
      		err(1, "mmap");
      
      	*testptr = 0;
      	*testptr2 = 1;
      
      	if (syscall(SYS_arch_prctl, ARCH_SET_GS,
      		    (unsigned long)testptr2 - (unsigned long)testptr) != 0)
      		err(1, "ARCH_SET_GS");
      
      	usleep(100);
      
      	if (read_gs_testvals() == 1) {
      		printf("[OK]\tARCH_SET_GS worked\n");
      	} else {
      		printf("[FAIL]\tARCH_SET_GS failed\n");
      		errors++;
      	}
      
      	asm volatile ("mov %0,%%gs" : : "r" (0));
      
      	if (read_gs_testvals() == 0) {
      		printf("[OK]\tWriting 0 to gs worked\n");
      	} else {
      		printf("[FAIL]\tWriting 0 to gs failed\n");
      		errors++;
      	}
      
      	usleep(100);
      
      	if (read_gs_testvals() == 0) {
      		printf("[OK]\tgsbase is still zero\n");
      	} else {
      		printf("[FAIL]\tgsbase was corrupted\n");
      		errors++;
      	}
      
      	return errors == 0 ? 0 : 1;
      }
      
       ----- end gsbase test -----
      Signed-off-by: default avatarAndy Lutomirski <luto@amacapital.net>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/509d27c9fec78217691c3dad91cec87e1006b34a.1418075657.git.luto@amacapital.netSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      39f1a2d2
    • Eric W. Biederman's avatar
      userns: Only allow the creator of the userns unprivileged mappings · be2aec30
      Eric W. Biederman authored
      commit f95d7918 upstream.
      
      If you did not create the user namespace and are allowed
      to write to uid_map or gid_map you should already have the necessary
      privilege in the parent user namespace to establish any mapping
      you want so this will not affect userspace in practice.
      
      Limiting unprivileged uid mapping establishment to the creator of the
      user namespace makes it easier to verify all credentials obtained with
      the uid mapping can be obtained without the uid mapping without
      privilege.
      
      Limiting unprivileged gid mapping establishment (which is temporarily
      absent) to the creator of the user namespace also ensures that the
      combination of uid and gid can already be obtained without privilege.
      
      This is part of the fix for CVE-2014-8989.
      Reviewed-by: default avatarAndy Lutomirski <luto@amacapital.net>
      Signed-off-by: default avatar"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      be2aec30
    • Eric W. Biederman's avatar
      userns: Document what the invariant required for safe unprivileged mappings. · be9edd8b
      Eric W. Biederman authored
      commit 0542f17b upstream.
      
      The rule is simple.  Don't allow anything that wouldn't be allowed
      without unprivileged mappings.
      
      It was previously overlooked that establishing gid mappings would
      allow dropping groups and potentially gaining permission to files and
      directories that had lesser permissions for a specific group than for
      all other users.
      
      This is the rule needed to fix CVE-2014-8989 and prevent any other
      security issues with new_idmap_permitted.
      
      The reason for this rule is that the unix permission model is old and
      there are programs out there somewhere that take advantage of every
      little corner of it.  So allowing a uid or gid mapping to be
      established without privielge that would allow anything that would not
      be allowed without that mapping will result in expectations from some
      code somewhere being violated.  Violated expectations about the
      behavior of the OS is a long way to say a security issue.
      Signed-off-by: default avatar"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      be9edd8b
    • Eric W. Biederman's avatar
      userns: Check euid no fsuid when establishing an unprivileged uid mapping · 65007036
      Eric W. Biederman authored
      commit 80dd00a2 upstream.
      
      setresuid allows the euid to be set to any of uid, euid, suid, and
      fsuid.  Therefor it is safe to allow an unprivileged user to map
      their euid and use CAP_SETUID privileged with exactly that uid,
      as no new credentials can be obtained.
      
      I can not find a combination of existing system calls that allows setting
      uid, euid, suid, and fsuid from the fsuid making the previous use
      of fsuid for allowing unprivileged mappings a bug.
      
      This is part of a fix for CVE-2014-8989.
      Reviewed-by: default avatarAndy Lutomirski <luto@amacapital.net>
      Signed-off-by: default avatar"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      65007036
    • Andy Lutomirski's avatar
      x86/tls: Validate TLS entries to protect espfix · eb8b9652
      Andy Lutomirski authored
      commit 41bdc785 upstream.
      
      Installing a 16-bit RW data segment into the GDT defeats espfix.
      AFAICT this will not affect glibc, Wine, or dosemu at all.
      Signed-off-by: default avatarAndy Lutomirski <luto@amacapital.net>
      Acked-by: default avatarH. Peter Anvin <hpa@zytor.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: security@kernel.org <security@kernel.org>
      Cc: Willy Tarreau <w@1wt.eu>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      eb8b9652
    • Nadav Amit's avatar
      [3.13-stable only] KVM: x86: Fix far-jump to non-canonical check · 111fefa3
      Nadav Amit authored
      commit 7e46dddd upstream.
      
      [3.13-stable's first backport (f9bffe04) of this commit accidentally omitted
      part of the upstream patch (the WARN_ON fixes), supplied here.]
      
      Commit d1442d85 ("KVM: x86: Handle errors when RIP is set during far
      jumps") introduced a bug that caused the fix to be incomplete.  Due to
      incorrect evaluation, far jump to segment with L bit cleared (i.e., 32-bit
      segment) and RIP with any of the high bits set (i.e, RIP[63:32] != 0) set may
      not trigger #GP.  As we know, this imposes a security problem.
      
      In addition, the condition for two warnings was incorrect.
      
      Fixes: d1442d85Reported-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarNadav Amit <namit@cs.technion.ac.il>
      [Add #ifdef CONFIG_X86_64 to avoid complaints of undefined behavior. - Paolo]
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Cc: Vinson Lee <vlee@twopensource.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      111fefa3
    • Ronald Wahl's avatar
      usb: gadget: at91_udc: move prepare clk into process context · 0847cfb9
      Ronald Wahl authored
      commit b2ba27a5 upstream.
      
      Commit 76280832 (usb: gadget: at91_udc:
      prepare clk before calling enable) added clock preparation in interrupt
      context. This is not allowed as it might sleep. Also setting the clock
      rate is unsafe to call from there for the same reason. Move clock
      preparation and setting clock rate into process context (at91udc_probe).
      Signed-off-by: default avatarRonald Wahl <ronald.wahl@raritan.com>
      Acked-by: default avatarAlexandre Belloni <alexandre.belloni@free-electrons.com>
      Acked-by: default avatarBoris Brezillon <boris.brezillon@free-electrons.com>
      Acked-by: default avatarNicolas Ferre <nicolas.ferre@atmel.com>
      Cc: Felipe Balbi <balbi@ti.com>
      Signed-off-by: default avatarFelipe Balbi <balbi@ti.com>
      [ kamal: backport to 3.13-stable: at91_udc.c moved ]
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      0847cfb9
    • David Ertman's avatar
      e1000e: Fix no connectivity when driver loaded with cable out · b81e3d0a
      David Ertman authored
      commit b20a7744 upstream.
      
      In commit da1e2046, the flow for enabling/disabling an Si errata
      workaround (e1000_lv_jumbo_workaround_ich8lan) was changed to fix a problem
      with iAMT connections dropping on interface down with jumbo frames set.
      Part of this change was to move the function call disabling the workaround
      to e1000e_down() from the e1000_setup_rctl() function.  The mechanic for
      disabling of this workaround involves writing several MAC and PHY registers
      back to hardware defaults.
      
      After this commit, when the driver is loaded with the cable out, the PHY
      registers are not programmed with the correct default values.  This causes
      the device to be capable of transmitting packets, but is unable to recieve
      them until this workaround is called.
      
      The flow of e1000e's open code relies upon calling the above workaround to
      expicitly program these registers either with jumbo frame appropriate settings
      or h/w defaults on 82579 and newer hardware.
      
      Fix this issue by adding logic to e1000_setup_rctl() that not only calls
      e1000_lv_jumbo_workaround_ich8lan() when jumbo frames are set, to enable the
      workaround, but also calls this function to explicitly disable the workaround
      in the case that jumbo frames are not set.
      Signed-off-by: default avatarDave Ertman <davidx.m.ertman@intel.com>
      Tested-by: default avatarJeff Pieper <jeffrey.e.pieper@intel.com>
      Signed-off-by: default avatarJeff Kirsher <jeffrey.t.kirsher@intel.com>
      Cc: Joseph Salisbury <joseph.salisbury@canonical.com>
      BugLink: http://bugs.launchpad.net/bugs/1400365Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      b81e3d0a
  4. 09 Jan, 2015 1 commit
  5. 18 Dec, 2014 1 commit
  6. 15 Dec, 2014 11 commits
  7. 09 Dec, 2014 2 commits