1. 05 Jun, 2017 35 commits
  2. 04 Apr, 2017 5 commits
    • Ben Hutchings's avatar
      Linux 3.16.43 · 760b5f93
      Ben Hutchings authored
      760b5f93
    • Ben Hutchings's avatar
      keys: Guard against null match function in keyring_search_aux() · c53ee259
      Ben Hutchings authored
      The "dead" key type has no match operation, and a search for keys of
      this type can cause a null dereference in keyring_search_iterator().
      keyring_search() has a check for this, but request_keyring_and_link()
      does not.  Move the check into keyring_search_aux(), covering both of
      them.
      
      This was fixed upstream by commit c06cfb08 ("KEYS: Remove
      key_type::match in favour of overriding default by match_preparse"),
      part of a series of large changes that are not suitable for
      backporting.
      
      CVE-2017-2647 / CVE-2017-6951
      Reported-by: default avatarIgor Redko <redkoi@virtuozzo.com>
      Reported-by: default avatarAndrey Ryabinin <aryabinin@virtuozzo.com>
      References: https://bugzilla.redhat.com/show_bug.cgi?id=CVE-2017-2647Reported-by: default avataridl3r <idler1984@gmail.com>
      References: https://www.spinics.net/lists/keyrings/msg01845.htmlSigned-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: David Howells <dhowells@redhat.com>
      c53ee259
    • Jann Horn's avatar
      aio: mark AIO pseudo-fs noexec · 880366a6
      Jann Horn authored
      commit 22f6b4d3 upstream.
      
      This ensures that do_mmap() won't implicitly make AIO memory mappings
      executable if the READ_IMPLIES_EXEC personality flag is set.  Such
      behavior is problematic because the security_mmap_file LSM hook doesn't
      catch this case, potentially permitting an attacker to bypass a W^X
      policy enforced by SELinux.
      
      I have tested the patch on my machine.
      
      To test the behavior, compile and run this:
      
          #define _GNU_SOURCE
          #include <unistd.h>
          #include <sys/personality.h>
          #include <linux/aio_abi.h>
          #include <err.h>
          #include <stdlib.h>
          #include <stdio.h>
          #include <sys/syscall.h>
      
          int main(void) {
              personality(READ_IMPLIES_EXEC);
              aio_context_t ctx = 0;
              if (syscall(__NR_io_setup, 1, &ctx))
                  err(1, "io_setup");
      
              char cmd[1000];
              sprintf(cmd, "cat /proc/%d/maps | grep -F '/[aio]'",
                  (int)getpid());
              system(cmd);
              return 0;
          }
      
      In the output, "rw-s" is good, "rwxs" is bad.
      Signed-off-by: default avatarJann Horn <jann@thejh.net>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      [bwh: Backported to 3.16: we don't have super_block::s_iflags; use
       file_system_type::fs_flags instead]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      880366a6
    • Eric W. Biederman's avatar
      vfs: Commit to never having exectuables on proc and sysfs. · 495d1af4
      Eric W. Biederman authored
      commit 22f6b4d3 upstream.
      
      Today proc and sysfs do not contain any executable files.  Several
      applications today mount proc or sysfs without noexec and nosuid and
      then depend on there being no exectuables files on proc or sysfs.
      Having any executable files show on proc or sysfs would cause
      a user space visible regression, and most likely security problems.
      
      Therefore commit to never allowing executables on proc and sysfs by
      adding a new flag to mark them as filesystems without executables and
      enforce that flag.
      
      Test the flag where MNT_NOEXEC is tested today, so that the only user
      visible effect will be that exectuables will be treated as if the
      execute bit is cleared.
      
      The filesystems proc and sysfs do not currently incoporate any
      executable files so this does not result in any user visible effects.
      
      This makes it unnecessary to vet changes to proc and sysfs tightly for
      adding exectuable files or changes to chattr that would modify
      existing files, as no matter what the individual file say they will
      not be treated as exectuable files by the vfs.
      
      Not having to vet changes to closely is important as without this we
      are only one proc_create call (or another goof up in the
      implementation of notify_change) from having problematic executables
      on proc.  Those mistakes are all too easy to make and would create
      a situation where there are security issues or the assumptions of
      some program having to be broken (and cause userspace regressions).
      Signed-off-by: default avatar"Eric W. Biederman" <ebiederm@xmission.com>
      [bwh: Backported to 3.16: we don't have super_block::s_iflags; use
       file_system_type::fs_flags instead]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      495d1af4
    • Florian Westphal's avatar
      netlink: remove mmapped netlink support · 07a365dd
      Florian Westphal authored
      commit d1b4c689 upstream.
      
      mmapped netlink has a number of unresolved issues:
      
      - TX zerocopy support had to be disabled more than a year ago via
        commit 4682a035 ("netlink: Always copy on mmap TX.")
        because the content of the mmapped area can change after netlink
        attribute validation but before message processing.
      
      - RX support was implemented mainly to speed up nfqueue dumping packet
        payload to userspace.  However, since commit ae08ce00
        ("netfilter: nfnetlink_queue: zero copy support") we avoid one copy
        with the socket-based interface too (via the skb_zerocopy helper).
      
      The other problem is that skbs attached to mmaped netlink socket
      behave different from normal skbs:
      
      - they don't have a shinfo area, so all functions that use skb_shinfo()
      (e.g. skb_clone) cannot be used.
      
      - reserving headroom prevents userspace from seeing the content as
      it expects message to start at skb->head.
      See for instance
      commit aa3a0220 ("netlink: not trim skb for mmaped socket when dump").
      
      - skbs handed e.g. to netlink_ack must have non-NULL skb->sk, else we
      crash because it needs the sk to check if a tx ring is attached.
      
      Also not obvious, leads to non-intuitive bug fixes such as 7c7bdf35
      ("netfilter: nfnetlink: use original skbuff when acking batches").
      
      mmaped netlink also didn't play nicely with the skb_zerocopy helper
      used by nfqueue and openvswitch.  Daniel Borkmann fixed this via
      commit 6bb0fef4 ("netlink, mmap: fix edge-case leakages in nf queue
      zero-copy")' but at the cost of also needing to provide remaining
      length to the allocation function.
      
      nfqueue also has problems when used with mmaped rx netlink:
      - mmaped netlink doesn't allow use of nfqueue batch verdict messages.
        Problem is that in the mmap case, the allocation time also determines
        the ordering in which the frame will be seen by userspace (A
        allocating before B means that A is located in earlier ring slot,
        but this also means that B might get a lower sequence number then A
        since seqno is decided later.  To fix this we would need to extend the
        spinlocked region to also cover the allocation and message setup which
        isn't desirable.
      - nfqueue can now be configured to queue large (GSO) skbs to userspace.
        Queing GSO packets is faster than having to force a software segmentation
        in the kernel, so this is a desirable option.  However, with a mmap based
        ring one has to use 64kb per ring slot element, else mmap has to fall back
        to the socket path (NL_MMAP_STATUS_COPY) for all large packets.
      
      To use the mmap interface, userspace not only has to probe for mmap netlink
      support, it also has to implement a recv/socket receive path in order to
      handle messages that exceed the size of an rx ring element.
      
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Ken-ichirou MATSUZAWA <chamaken@gmail.com>
      Cc: Pablo Neira Ayuso <pablo@netfilter.org>
      Cc: Patrick McHardy <kaber@trash.net>
      Cc: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [bwh: Backported to 3.16: deleted code and documentation is different in places]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Shi Yuejie <shiyuejie@outlook.com>
      07a365dd