  1. 20 Aug, 2010 1 commit
  2. 11 Aug, 2010 2 commits
    • pids: alloc_pidmap: remove the unnecessary boundary checks · c52b0b91
      Oleg Nesterov authored
      alloc_pidmap() calculates max_scan so that if the initial offset != 0 we
      inspect the first map->page twice.  This is correct, we want to find the
      unused bits < offset in this bitmap block.  Add the comment.
      
      But it doesn't make any sense to stop the find_next_offset() loop when we
      are looking into this map->page for the second time.  We have already
      checked the bits >= offset during the first attempt, but it is fine to
      do this again, no matter if we succeed this time or not.
      
      Remove this hard-to-understand code.  It optimizes the very unlikely case
      when we are going to fail, but slows down the more likely case.
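
The scan order described above can be illustrated with a small userspace sketch. It is not the kernel's alloc_pidmap(): the names alloc_id, used, BITS_PER_BLOCK and NUM_BLOCKS are made up for illustration, the "bitmap" is one byte per id, and details such as reserved pids are ignored. It only shows why the loop runs one extra time when the starting offset is non-zero.

```c
#include <stdio.h>

#define BITS_PER_BLOCK 8   /* toy value; the kernel uses a page worth of bits */
#define NUM_BLOCKS     4
#define ID_MAX         (BITS_PER_BLOCK * NUM_BLOCKS)

/* Hypothetical allocation map: one byte per id keeps the sketch simple. */
static unsigned char used[ID_MAX];

/* Return the first free id after `last`, wrapping around, or -1 if none. */
static int alloc_id(int last)
{
    int id = (last + 1) % ID_MAX;
    int block = id / BITS_PER_BLOCK;
    int offset = id % BITS_PER_BLOCK;
    /*
     * Visit every block once. If we start in the middle of a block,
     * visit that block a second time at the end so that the bits
     * below `offset` are inspected too.
     */
    int max_scan = NUM_BLOCKS - (offset == 0);

    for (int i = 0; i <= max_scan; i++) {
        for (int bit = offset; bit < BITS_PER_BLOCK; bit++) {
            int cand = block * BITS_PER_BLOCK + bit;
            if (!used[cand]) {
                used[cand] = 1;
                return cand;
            }
        }
        /* Later blocks are always scanned from their first bit. */
        block = (block + 1) % NUM_BLOCKS;
        offset = 0;
    }
    return -1;
}
```

With every id taken except 9, alloc_id(10) starts at id 11, walks through the other blocks, and finds 9 on its second visit to the starting block. The second visit also rechecks the bits >= offset, which is harmless.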
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Cc: Salman Qazi <sqazi@google.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • pids: fix a race in pid generation that causes pids to be reused immediately · 5fdee8c4
      Salman authored
      A program that repeatedly forks and waits is susceptible to having the
      same pid repeated, especially when it competes with another instance of
      the same program.  This is really bad for the bash implementation.
      Furthermore, many shell scripts assume that pid numbers will not be reused
      for some length of time.
      
      Race Description:
      
      A                                    B
      
      // pid == offset == n                // pid == offset == n + 1
      test_and_set_bit(offset, map->page)
                                           test_and_set_bit(offset, map->page);
                                           pid_ns->last_pid = pid;
      pid_ns->last_pid = pid;
                                           // pid == n + 1 is freed (wait())
      
                                           // Next fork()...
                                           last = pid_ns->last_pid; // == n
                                           pid = last + 1;
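
One way to close a race of this shape is to update last_pid with a compare-and-swap that refuses to move the value backwards. The following is a userspace sketch using C11 atomics, not code quoted from the patch: the global last_pid and the helpers pid_before() and set_last_pid() are illustrative stand-ins, and `base` is assumed to be the last_pid value the allocator read before it started searching.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for pid_ns->last_pid. */
static _Atomic int last_pid;

/* True if a comes before b when walking forward from base (wrap-safe). */
static int pid_before(int base, int a, int b)
{
    return (unsigned)(a - base) < (unsigned)(b - base);
}

/*
 * Publish pid as the new last_pid unless another allocator has already
 * stored a value at or beyond it, measured forward from base.
 */
static void set_last_pid(int base, int pid)
{
    int expected = base;

    while (!atomic_compare_exchange_strong(&last_pid, &expected, pid)) {
        /* On failure, expected now holds the value currently stored. */
        if (!pid_before(base, expected, pid))
            return;  /* a later pid is already recorded; keep it */
    }
}
```

In the interleaving above, B stores n + 1 first. When A then tries to store n, the compare-and-swap fails, A sees that n + 1 is not before n, and it leaves last_pid alone, so the next fork() starts searching after n + 1.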
      
      Code to reproduce it (Running multiple instances is more effective):
      
      #include <errno.h>
      #include <sys/types.h>
      #include <sys/wait.h>
      #include <unistd.h>
      #include <stdio.h>
      #include <stdlib.h>
      
      // The distance mod 32768 between two pids, where the first pid is expected
      // to be smaller than the second.
      int PidDistance(pid_t first, pid_t second) {
        return (second + 32768 - first) % 32768;
      }
      
      int main(int argc, char* argv[]) {
        int failed = 0;
        pid_t last_pid = 0;
        int i;
        printf("%zu\n", sizeof(pid_t));
        for (i = 0; i < 10000000; ++i) {
          if (i % 32768 == 0)
            printf("Iter: %d\n", i/32768);
          int child_exit_code = i % 256;
          pid_t pid = fork();
          if (pid == -1) {
            fprintf(stderr, "fork failed, iteration %d, errno=%d\n", i, errno);
            exit(1);
          }
          if (pid == 0) {
            // Child
            exit(child_exit_code);
          } else {
            // Parent
            if (i > 0) {
              int distance = PidDistance(last_pid, pid);
              if (distance == 0 || distance > 30000) {
                fprintf(stderr,
                        "Unexpected pid sequence: previous fork: pid=%d, "
                        "current fork: pid=%d for iteration=%d.\n",
                        last_pid, pid, i);
                failed = 1;
              }
            }
            last_pid = pid;
            int status;
            int reaped = wait(&status);
            if (reaped != pid) {
              fprintf(stderr,
                      "Wait return value: expected pid=%d, "
                      "got %d, iteration %d\n",
                      pid, reaped, i);
              failed = 1;
            } else if (WEXITSTATUS(status) != child_exit_code) {
              fprintf(stderr,
                      "Unexpected exit status %x, iteration %d\n",
                      WEXITSTATUS(status), i);
              failed = 1;
            }
          }
        }
        exit(failed);
      }
      
      Thanks to Ted Tso for the key ideas of this implementation.
      Signed-off-by: Salman Qazi <sqazi@google.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  3. 27 May, 2010 1 commit
  4. 06 Mar, 2010 1 commit
  5. 04 Mar, 2010 1 commit
  6. 25 Feb, 2010 1 commit
    • sched: Use lockdep-based checking on rcu_dereference() · d11c563d
      Paul E. McKenney authored
      Update the rcu_dereference() usages to take advantage of the new
      lockdep-based checking.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josh@joshtriplett.org
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      Cc: Valdis.Kletnieks@vt.edu
      Cc: dhowells@redhat.com
      LKML-Reference: <1266887105-1528-6-git-send-email-paulmck@linux.vnet.ibm.com>
      [ -v2: fix allmodconfig missing symbol export build failure on x86 ]
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  7. 16 Dec, 2009 2 commits
  8. 22 Sep, 2009 1 commit
  9. 09 Jul, 2009 1 commit
  10. 29 Jun, 2009 1 commit
  11. 18 Jun, 2009 1 commit
    • pids: clean up find_task_by_pid variants · 17f98dcf
      Christoph Hellwig authored
      find_task_by_pid_type_ns is only used to implement find_task_by_vpid and
      find_task_by_pid_ns, but both of them pass PIDTYPE_PID as first argument.
      So just fold find_task_by_pid_type_ns into find_task_by_pid_ns and use
      find_task_by_pid_ns to implement find_task_by_vpid.
      
      While we're at it also remove the exports for find_task_by_pid_ns and
      find_task_by_vpid - we don't have any modular callers left as the only
      modular caller of the old pre-pid-namespace find_task_by_pid (gfs2) was
      switched to pid_task, which operates on a struct pid pointer instead of a
      pid_t.  Given the confusion about pid_t values vs namespaces, that's
      generally the better option anyway, and I think we're better off restricting
      modules to do it that way.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Cc: Pavel Emelyanov <xemul@openvz.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  12. 03 Apr, 2009 2 commits
    • pids: refactor vnr/nr_ns helpers to make them safe · 52ee2dfd
      Oleg Nesterov authored
      IMHO, the safety rules for vnr/nr_ns helpers are horrible and buggy.
      
      task_pid_nr_ns(task) needs rcu/tasklist depending on task == current.
      
      As for "special" pids, vnr/nr_ns helpers always need rcu.  However, if
      task != current, they are unsafe even under rcu lock, we can't trust
      task->group_leader without the special checks.
      
      And almost every helper has a callsite which needs a fix.
      
      Also, it is a bit annoying that the implementations of, say,
      task_pgrp_vnr() and task_pgrp_nr_ns() are not "symmetrical".
      
      This patch introduces the new helper, __task_pid_nr_ns(), which is always
      safe to use, and turns all other helpers into the trivial wrappers.
      
      After this I'll send another patch which converts task_tgid_xxx() as well;
      they are a bit special.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Cc: Louis Rilling <Louis.Rilling@kerlabs.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Pavel Emelyanov <xemul@openvz.org>
      Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Cc: Roland McGrath <roland@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • pids: improve get_task_pid() to fix the unsafe sys_wait4()->task_pgrp() · 2ae448ef
      Oleg Nesterov authored
      sys_wait4() does get_pid(task_pgrp(current)), which is not safe.  We could
      add rcu lock/unlock around it, but we already have get_task_pid(), which can
      be improved to handle the special pids in a more reliable manner.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Cc: Louis Rilling <Louis.Rilling@kerlabs.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Pavel Emelyanov <xemul@openvz.org>
      Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Cc: Roland McGrath <roland@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  13. 08 Jan, 2009 1 commit
    • pid: generalize task_active_pid_ns · 61bce0f1
      Eric W. Biederman authored
      Currently task_active_pid_ns is not safe to call after a task becomes a
      zombie and exit_task_namespaces is called, as nsproxy becomes NULL.  By
      reading the pid namespace from the pid of the task we can trivially solve
      this problem at the cost of one extra memory read in what should be the
      same cacheline as we read the namespace from.
      
      When moving things around I have made task_active_pid_ns out of line
      because keeping it in pid_namespace.h would require adding includes of
      pid.h and sched.h that I don't think we want.
      
      This change does make task_active_pid_ns unsafe to call during
      copy_process until we attach a pid on the task_struct which seems to be a
      reasonable trade off.
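
The trade-off described above can be pictured with a toy model. The structures below are heavily simplified, hypothetical stand-ins rather than the kernel's definitions, and the two lookup functions are named for illustration only. The NULL checks are there so the sketch can show the failure case safely; they are not a claim about how the kernel code behaves.

```c
#include <stddef.h>

/* Simplified, hypothetical stand-ins for the kernel structures. */
struct pid_namespace { int level; };
struct nsproxy { struct pid_namespace *pid_ns; };
struct pid { struct pid_namespace *ns; };
struct task { struct nsproxy *nsproxy; struct pid *pid; };

/* Lookup through nsproxy: no answer once an exiting task has dropped it. */
static struct pid_namespace *active_ns_via_nsproxy(const struct task *t)
{
    return t->nsproxy ? t->nsproxy->pid_ns : NULL;
}

/*
 * Lookup through the task's pid: still works for a zombie, but has no
 * answer until a pid has been attached during task creation.
 */
static struct pid_namespace *active_ns_via_pid(const struct task *t)
{
    return t->pid ? t->pid->ns : NULL;
}
```

For a task whose nsproxy has been cleared but whose pid is still attached, the first lookup yields nothing while the second still returns the namespace. For a task that has no pid attached yet, the situation is reversed.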
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: Bastian Blank <bastian@waldi.eu.org>
      Cc: Pavel Emelyanov <xemul@openvz.org>
      Cc: Nadia Derbey <Nadia.Derbey@bull.net>
      Acked-by: Serge Hallyn <serue@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  14. 06 Jan, 2009 1 commit
  15. 25 Jul, 2008 2 commits
  16. 19 May, 2008 1 commit
  17. 30 Apr, 2008 4 commits
  18. 08 Feb, 2008 3 commits
  19. 07 Feb, 2008 1 commit
  20. 15 Nov, 2007 1 commit
    • pidns: Place under CONFIG_EXPERIMENTAL · 57d5f66b
      Eric W. Biederman authored
      This is my trivial patch to swat innumerable little bugs with a single
      blow.
      
      After some intensive review (my apologies for not having gotten to this
      sooner) what we have looks like a good base to build on with the current
      pid namespace code, but it is not complete, and it is still much too simple
      to find issues where the kernel does the wrong thing outside of the initial
      pid namespace.
      
      Until the dust settles and we are certain we have the ABI and the
      implementation is as correct as humanly possible let's keep process ID
      namespaces behind CONFIG_EXPERIMENTAL.
      
      This allows us the option of fixing any ABI or other bugs we find, as long
      as they are minor.
      
      It also allows users of the kernel to avoid those bugs simply by ensuring
      their kernel does not have support for multiple pid namespaces.
      
      [akpm@linux-foundation.org: coding-style cleanups]
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Cedric Le Goater <clg@fr.ibm.com>
      Cc: Adrian Bunk <bunk@kernel.org>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Cc: Kir Kolyshkin <kir@swsoft.com>
      Cc: Kirill Korotaev <dev@sw.ru>
      Cc: Pavel Emelyanov <xemul@openvz.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  21. 19 Oct, 2007 11 commits