1. 03 May, 2012 6 commits
  2. 26 Apr, 2012 2 commits
    • Eric W. Biederman's avatar
      userns: Rework the user_namespace adding uid/gid mapping support · 22d917d8
      Eric W. Biederman authored
      - Convert the old uid mapping functions into compatibility wrappers
      - Add a uid/gid mapping layer from user space uid and gids to kernel
        internal uids and gids that is extent based for simplicity and speed.
        * Working with number space after mapping uids/gids into their kernel
          internal version adds only mapping complexity over what we have today,
          leaving the kernel code easy to understand and test.
      - Add proc files /proc/self/uid_map /proc/self/gid_map
        These files display the mapping and allow a mapping to be added
        if a mapping does not exist.
      - Allow entering the user namespace without a uid or gid mapping.
        Since we are starting with an existing user, our uids and gids
        still have global mappings, so they are still valid and useful; they
        just don't have local mappings.  The only requirement for things to
        work is a global uid and gid, so it is odd but perfectly fine not to
        have a local uid and gid mapping.
        Not requiring local uid and gid mappings greatly simplifies
        the logic of setting up the uid and gid mappings by allowing
        the mappings to be set after the namespace is created, which makes the
        slight weirdness worth it.
      - Make the mappings in the initial user namespace to the global
        uid/gid space explicit.  Today it is an identity mapping
        but in the future we may want to twist this for debugging, similar
        to what we do with jiffies.
      - Document the memory ordering requirements of setting the uid and
        gid mappings.  We only allow the mappings to be set once
        and there are no pointers involved so the requirements are
        trivial but a little atypical.
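
The extent-based mapping described above can be sketched in user-space C as follows. This is an illustration under stated assumptions, not the kernel implementation: `struct id_extent`, `map_id_down()`, `map_id_up()` and `ID_INVALID` are names chosen for this sketch. The three fields follow the three columns of a uid_map line (first ID inside the namespace, first ID outside it, length of the range).

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative extent: IDs [first, first + count) inside the namespace
 * correspond to [lower_first, lower_first + count) outside it. */
struct id_extent {
    uint32_t first;
    uint32_t lower_first;
    uint32_t count;
};

/* Sentinel used by this sketch for "no mapping exists". */
#define ID_INVALID ((uint32_t)-1)

/* Translate a namespace-local ID to the outer (kernel-internal) ID. */
static uint32_t map_id_down(const struct id_extent *map, size_t n, uint32_t id)
{
    for (size_t i = 0; i < n; i++) {
        if (id >= map[i].first && id - map[i].first < map[i].count)
            return map[i].lower_first + (id - map[i].first);
    }
    return ID_INVALID;
}

/* Translate an outer ID back to the namespace-local ID. */
static uint32_t map_id_up(const struct id_extent *map, size_t n, uint32_t id)
{
    for (size_t i = 0; i < n; i++) {
        if (id >= map[i].lower_first && id - map[i].lower_first < map[i].count)
            return map[i].first + (id - map[i].lower_first);
    }
    return ID_INVALID;
}
```

With a single extent `{0, 100000, 1000}`, ID 0 inside the namespace translates to 100000 outside it, and any ID not covered by an extent has no mapping. A linear scan over a handful of extents keeps the lookup simple, which is in the spirit of the "simplicity and speed" rationale above.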
      
      Performance:
      
      In this scheme for the permission checks the performance is expected to
      stay the same as the actual machine instructions should remain the same.
      
      The worst case I could think of is ls -l on a large directory where
      all of the stat results need to be translated from kuids and
      kgids to uids and gids.  So I benchmarked that case on my laptop
      with a dual core hyperthread Intel i5-2520M cpu with 3M of cpu cache.
      
      My benchmark consisted of going to single user mode where nothing else
      was running.  On an ext4 filesystem I opened 1,000,000 files, looped
      through all of the files 1000 times, and called fstat on the
      individual files.  This was to ensure I was benchmarking stat times
      where the inodes were in the kernel's cache, but the inode values were
      not in the processor's cache.  My results:
      
      v3.4-rc1:         ~= 156ns (unmodified v3.4-rc1 with user namespace support disabled)
      v3.4-rc1-userns-: ~= 155ns (v3.4-rc1 with my user namespace patches and user namespace support disabled)
      v3.4-rc1-userns+: ~= 164ns (v3.4-rc1 with my user namespace patches and user namespace support enabled)
      
      All of the configurations ran in roughly 120ns when I performed tests
      that ran in the cpu cache.
      
      So in summary the performance impact is:
      1ns improvement in the worst case with user namespace support compiled out.
      8ns aka 5% slowdown in the worst case with user namespace support compiled in.
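
The percentages in the summary follow directly from the measured averages. A small sketch of the arithmetic, where `slowdown_percent()` is a helper name invented for this illustration:

```c
/* Relative change of `measured_ns` against `baseline_ns`, in percent.
 * Positive means slower than the baseline, negative means faster. */
static double slowdown_percent(double baseline_ns, double measured_ns)
{
    return (measured_ns - baseline_ns) / baseline_ns * 100.0;
}
```

Against the 156ns baseline, 164ns is (164 - 156) / 156, or roughly 5.1% slower, and 155ns is roughly 0.6% faster.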
      Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    • Eric W. Biederman's avatar
      userns: Simplify the user_namespace by making userns->creator a kuid. · 783291e6
      Eric W. Biederman authored
      - Transform userns->creator from a user_struct reference to a simple
        kuid_t, kgid_t pair.
      
        In cap_capable this allows the check to see if we are the creator of
        a namespace to become the classic suser style euid permission check.
      
        This allows us to remove the need for a struct cred in the mapping
        functions and still be able to display the user namespace creator's
        uid and gid as 0.
      
      - Remove the now unnecessary delayed_work in free_user_ns.
      
        All that is left for free_user_ns to do is to call kmem_cache_free
        and put_user_ns.  Those functions can be called in any context
        so call them directly from free_user_ns removing the need for delayed work.
      Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
  3. 08 Apr, 2012 4 commits
  4. 07 Apr, 2012 9 commits
  5. 03 Apr, 2012 2 commits
  6. 31 Mar, 2012 17 commits
    • Linus Torvalds's avatar
      Linux 3.4-rc1 · dd775ae2
      Linus Torvalds authored
    • Linus Torvalds's avatar
      Merge branch 's3-for-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/amit/virtio-console · b7ffff4b
      Linus Torvalds authored
      Pull virtio S3 support patches from Amit Shah:
       "Turns out S3 is not different from S4 for virtio devices: the device
        is assumed to be reset, so the host and guest state are to be assumed
        to be out of sync upon resume.  We handle the S4 case with exactly the
        same scenario, so just point the suspend/resume routines to the
        freeze/restore ones.
      
        Once that is done, we also use the PM API's macro to initialise the
        sleep functions.
      
        A couple of cleanups are included: there's no need for special thaw
        processing in the balloon driver, so that's addressed in patches 1 and
        2.
      
        Testing: both S3 and S4 support have been tested using these patches
        using a similar method used earlier during S4 patch development: a
        guest is started with virtio-blk as the only disk, a virtio network
        card, a virtio-serial port and a virtio balloon device.  Ping from
        guest to host, dd /dev/zero to a file on the disk, and IO from the
        host on the virtio-serial port, all at once, while exercising S4 and
        S3 (separately) were tested.  They all continue to work fine after
        resume.  virtio balloon values too were tested by inflating and
        deflating the balloon."
      
      Pulling from Amit, since Rusty is off getting married (and presumably
      shaving people).
      
      * 's3-for-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/amit/virtio-console:
        virtio-pci: switch to PM ops macro to initialise PM functions
        virtio-pci: S3 support
        virtio-pci: drop restore_common()
        virtio: drop thaw PM operation
        virtio: balloon: Allow stats update after restore from S4
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 8bb1f229
      Linus Torvalds authored
      Pull second try at vfs part d#2 from Al Viro:
       "Miklos' first series (with do_lookup() rewrite split into edible
        chunks) + assorted bits and pieces.
      
        The 'untangling of do_lookup()' series is a splitup of what used to
        be a monolithic patch from Miklos, so this series is basically "how do
        I convince myself that his patch is correct (or find a hole in it)".
        No holes found and I like the resulting cleanup, so in it went..."
      
      Changes from try 1: Fix a boot problem with selinux, and commit messages
      prettied up a bit.
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (24 commits)
        vfs: fix out-of-date dentry_unhash() comment
        vfs: split __lookup_hash
        untangling do_lookup() - take __lookup_hash()-calling case out of line.
        untangling do_lookup() - switch to calling __lookup_hash()
        untangling do_lookup() - merge d_alloc_and_lookup() callers
        untangling do_lookup() - merge failure exits in !dentry case
        untangling do_lookup() - massage !dentry case towards __lookup_hash()
        untangling do_lookup() - get rid of need_reval in !dentry case
        untangling do_lookup() - eliminate a loop.
        untangling do_lookup() - expand the area under ->i_mutex
        untangling do_lookup() - isolate !dentry stuff from the rest of it.
        vfs: move MAY_EXEC check from __lookup_hash()
        vfs: don't revalidate just looked up dentry
        vfs: fix d_need_lookup/d_revalidate order in do_lookup
        ext3: move headers to fs/ext3/
        migrate ext2_fs.h guts to fs/ext2/ext2.h
        new helper: ext2_image_size()
        get rid of pointless includes of ext2_fs.h
        ext2: No longer export ext2_fs.h to user space
        mtdchar: kill persistently held vfsmount
        ...
    • Linus Torvalds's avatar
      Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · f22e08a7
      Linus Torvalds authored
      Pull scheduler fixes from Ingo Molnar.
      
      * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched: Fix incorrect usage of for_each_cpu_mask() in select_fallback_rq()
        sched: Fix __schedule_bug() output when called from an interrupt
        sched/arch: Introduce the finish_arch_post_lock_switch() scheduler callback
    • Linus Torvalds's avatar
      Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · f187e9fd
      Linus Torvalds authored
      Pull perf updates and fixes from Ingo Molnar:
       "It's mostly fixes, but there's also two late items:
      
         - preliminary GTK GUI support for perf report
         - PMU raw event format descriptors in sysfs, to be parsed by tooling
      
        The raw event format in sysfs is a new ABI.  For example for the 'CPU'
        PMU we have:
      
          aldebaran:~> ll /sys/bus/event_source/devices/cpu/format/*
          -r--r--r--. 1 root root 4096 Mar 31 10:29 /sys/bus/event_source/devices/cpu/format/any
          -r--r--r--. 1 root root 4096 Mar 31 10:29 /sys/bus/event_source/devices/cpu/format/cmask
          -r--r--r--. 1 root root 4096 Mar 31 10:29 /sys/bus/event_source/devices/cpu/format/edge
          -r--r--r--. 1 root root 4096 Mar 31 10:29 /sys/bus/event_source/devices/cpu/format/event
          -r--r--r--. 1 root root 4096 Mar 31 10:29 /sys/bus/event_source/devices/cpu/format/inv
          -r--r--r--. 1 root root 4096 Mar 31 10:29 /sys/bus/event_source/devices/cpu/format/offcore_rsp
          -r--r--r--. 1 root root 4096 Mar 31 10:29 /sys/bus/event_source/devices/cpu/format/pc
          -r--r--r--. 1 root root 4096 Mar 31 10:29 /sys/bus/event_source/devices/cpu/format/umask
      
        those lists of fields contain a specific format:
      
          aldebaran:~> cat /sys/bus/event_source/devices/cpu/format/offcore_rsp
          config1:0-63
      
        So, those who wish to specify raw events can now use the following
        event format:
      
          -e cpu/cmask=1,event=2,umask=3
      
        Most people will not want to specify any events (let alone raw
        events), they'll just use whatever default event the tools use.
      
        But for more obscure PMU events that have no cross-architecture
        generic events the above syntax is more usable and a bit more
        structured than specifying hex numbers."
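
As an illustration of how tooling might consume such a format descriptor, here is a minimal C sketch that parses a string like `config1:0-63` and places a value into the described bit range. It is an assumption-laden example, not the perf tool's parser: `struct pmu_field`, `parse_format()` and `encode_field()` are hypothetical names, and any bit range other than the `config1:0-63` shown above is made up for the example.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical parsed form of one format file, e.g. "config1:0-63". */
struct pmu_field {
    char reg[16];   /* which config word the field lives in */
    unsigned lo;    /* lowest bit of the field */
    unsigned hi;    /* highest bit of the field */
};

/* Parse "name:lo-hi" or the single-bit form "name:bit".
 * Returns 0 on success, -1 on malformed input. */
static int parse_format(const char *spec, struct pmu_field *f)
{
    int n = sscanf(spec, "%15[^:]:%u-%u", f->reg, &f->lo, &f->hi);

    if (n == 2)
        f->hi = f->lo;          /* single-bit field */
    else if (n != 3)
        return -1;
    if (f->lo > f->hi || f->hi > 63)
        return -1;
    return 0;
}

/* Shift and mask a value into bits lo..hi of a 64-bit config word. */
static uint64_t encode_field(const struct pmu_field *f, uint64_t value)
{
    unsigned width = f->hi - f->lo + 1;
    uint64_t mask = (width == 64) ? ~0ULL : ((1ULL << width) - 1);

    return (value & mask) << f->lo;
}
```

A tool could OR together the encoded values of each `name=value` term in an event string to build the raw config word, which is what makes the named syntax easier to use than hand-assembled hex numbers.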
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits)
        perf tools: Remove auto-generated bison/flex files
        perf annotate: Fix off by one symbol hist size allocation and hit accounting
        perf tools: Add missing ref-cycles event back to event parser
        perf annotate: addr2line wants addresses in same format as objdump
        perf probe: Finder fails to resolve function name to address
        tracing: Fix ent_size in trace output
        perf symbols: Handle NULL dso in dso__name_len
        perf symbols: Do not include libgen.h
        perf tools: Fix bug in raw sample parsing
        perf tools: Fix display of first level of callchains
        perf tools: Switch module.h into export.h
        perf: Move mmap page data_head offset assertion out of header
        perf: Fix mmap_page capabilities and docs
        perf diff: Fix to work with new hists design
        perf tools: Fix modifier to be applied on correct events
        perf tools: Fix various casting issues for 32 bits
        perf tools: Simplify event_read_id exit path
        tracing: Fix ftrace stack trace entries
        tracing: Move the tracing_on/off() declarations into CONFIG_TRACING
        perf report: Add a simple GTK2-based 'perf report' browser
        ...
    • Linus Torvalds's avatar
      Merge tag 'parisc-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/parisc-2.6 · adb3b1f3
      Linus Torvalds authored
      Pull PARISC misc updates from James Bottomley:
       "This is a couple of minor updates (fixing lws futex locking and
        removing some obsolete cpu_*_map calls)."
      
      * tag 'parisc-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/parisc-2.6:
        [PARISC] remove references to cpu_*_map.
        [PARISC] futex: Use same lock set as lws calls
    • Linus Torvalds's avatar
      Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6 · a75ee6ec
      Linus Torvalds authored
      Pull SCSI updates from James Bottomley:
       "This is primarily another round of driver updates (lpfc, bfa, fcoe,
        ipr) plus a new ufshcd driver.  There shouldn't be anything
        controversial in here (The final deletion of scsi proc_ops which
        caused some build breakage has been held over until the next merge
        window to give us more time to stabilise it).
      
        I'm afraid, with me moving continents at exactly the wrong time,
        anything submitted after the merge window opened has been held over to
        the next merge window."
      
      * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (63 commits)
        [SCSI] ipr: Driver version 2.5.3
        [SCSI] ipr: Increase alignment boundary of command blocks
        [SCSI] ipr: Increase max concurrent oustanding commands
        [SCSI] ipr: Remove unnecessary memory barriers
        [SCSI] ipr: Remove unnecessary interrupt clearing on new adapters
        [SCSI] ipr: Fix target id allocation re-use problem
        [SCSI] atp870u, mpt2sas, qla4xxx use pci_dev->revision
        [SCSI] fcoe: Drop the rtnl_mutex before calling fcoe_ctlr_link_up
        [SCSI] bfa: Update the driver version to 3.0.23.0
        [SCSI] bfa: BSG and User interface fixes.
        [SCSI] bfa: Fix to avoid vport delete hang on request queue full scenario.
        [SCSI] bfa: Move service parameter programming logic into firmware.
        [SCSI] bfa: Revised Fabric Assigned Address(FAA) feature implementation.
        [SCSI] bfa: Flash controller IOC pll init fixes.
        [SCSI] bfa: Serialize the IOC hw semaphore unlock logic.
        [SCSI] bfa: Modify ISR to process pending completions
        [SCSI] bfa: Add fc host issue lip support
        [SCSI] mpt2sas: remove extraneous sas_log_info messages
        [SCSI] libfc: fcoe_transport_create fails in single-CPU environment
        [SCSI] fcoe: reduce contention for fcoe_rx_list lock [v2]
        ...
    • J. Bruce Fields's avatar
      vfs: fix out-of-date dentry_unhash() comment · c0d02594
      J. Bruce Fields authored
      64252c75 "vfs: remove dget() from
      dentry_unhash()" changed the implementation but not the comment.
      
      Cc: Sage Weil <sage@newdream.net>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • Miklos Szeredi's avatar
      vfs: split __lookup_hash · bad61189
      Miklos Szeredi authored
      Split __lookup_hash into two component functions:
      
       lookup_dcache - tries cached lookup, returns whether real lookup is needed
       lookup_real - calls i_op->lookup
      
      This eliminates code duplication between d_alloc_and_lookup() and
      d_inode_lookup().
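
The shape of this split, a cheap cached stage that reports whether the expensive stage is needed, can be sketched as a toy user-space model. Nothing here is kernel code: `toy_lookup_cached()`, `toy_lookup_real()`, `toy_lookup()` and the two small tables are stand-ins invented for the illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define CACHE_SLOTS 8

struct toy_entry {
    const char *name;
    int inode;
};

/* Toy stand-in for the dentry cache; empty slots have name == NULL. */
static struct toy_entry toy_cache[CACHE_SLOTS];

/* Toy stand-in for what the filesystem itself knows about. */
static const struct toy_entry toy_backing[] = {
    { "etc", 11 }, { "usr", 12 }, { "var", 13 },
};

/* Stage 1: try the cache and report whether a real lookup is needed. */
static const struct toy_entry *toy_lookup_cached(const char *name,
                                                 bool *need_lookup)
{
    *need_lookup = false;
    for (size_t i = 0; i < CACHE_SLOTS; i++) {
        if (toy_cache[i].name && strcmp(toy_cache[i].name, name) == 0)
            return &toy_cache[i];
    }
    *need_lookup = true;
    return NULL;
}

/* Stage 2: ask the backing store and cache the result if a slot is free. */
static const struct toy_entry *toy_lookup_real(const char *name)
{
    size_t count = sizeof(toy_backing) / sizeof(toy_backing[0]);

    for (size_t i = 0; i < count; i++) {
        if (strcmp(toy_backing[i].name, name) != 0)
            continue;
        for (size_t j = 0; j < CACHE_SLOTS; j++) {
            if (!toy_cache[j].name) {
                toy_cache[j] = toy_backing[i];
                return &toy_cache[j];
            }
        }
        return &toy_backing[i];   /* cache full: return it uncached */
    }
    return NULL;
}

/* A caller composes the two stages; *real_calls counts slow-path hits. */
static const struct toy_entry *toy_lookup(const char *name, int *real_calls)
{
    bool need_lookup;
    const struct toy_entry *e = toy_lookup_cached(name, &need_lookup);

    if (need_lookup) {
        (*real_calls)++;
        e = toy_lookup_real(name);
    }
    return e;
}
```

Because each caller only composes the two stages, the cache-probing and slow-path logic live in one place each rather than being repeated in every caller.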
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • Al Viro's avatar
    • Al Viro's avatar
      untangling do_lookup() - switch to calling __lookup_hash() · a3255546
      Al Viro authored
      now we have __lookup_hash() open-coded in the !dentry case;
      just call the damn thing instead...
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • Al Viro's avatar
      a6ecdfcf
    • Al Viro's avatar
      ec335e91
    • Al Viro's avatar
      untangling do_lookup() - massage !dentry case towards __lookup_hash() · d774a058
      Al Viro authored
      Reorder if-else cases for starters...
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • Al Viro's avatar
      untangling do_lookup() - get rid of need_reval in !dentry case · 08b0ab7c
      Al Viro authored
      Everything arriving into if (!dentry) will have need_reval = 1.
      Indeed, the only way to get there with need_reval reset to 0 would
      be via
      	if (unlikely(d_need_lookup(dentry)))
      		goto unlazy;
      	if (unlikely(dentry->d_flags & DCACHE_OP_REVALIDATE)) {
      		status = d_revalidate(dentry, nd);
      		if (unlikely(status <= 0)) {
      			if (status != -ECHILD)
      				need_reval = 0;
      			goto unlazy;
      ...
      unlazy:
      	/* no assignments to dentry */
      	if (dentry && unlikely(d_need_lookup(dentry))) {
      		dput(dentry);
      		dentry = NULL;
      	}
      and if d_need_lookup() had already been false the first time around, it
      will remain false on the second call as well.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • Al Viro's avatar
      untangling do_lookup() - eliminate a loop. · acc9cb3c
      Al Viro authored
      d_lookup() *will* fail after successful d_invalidate(), if we are
      holding i_mutex all along.  IOW, we don't need to jump back to
      l: - we know what path will be taken there and can do that (i.e.
      d_alloc_and_lookup()) directly.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • Al Viro's avatar
      untangling do_lookup() - expand the area under ->i_mutex · 37c17e1f
      Al Viro authored
      keep holding ->i_mutex over revalidation parts
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>