  1. 31 Mar, 2009 6 commits
    • blktrace: fix blk_probes_ref chaos · 17ba97e3
      Li Zefan authored
      Impact: fix mixed ioctl and ftrace-plugin blktrace use refcount bugs
      
      ioctl-based blktrace allocates bt and registers tracepoints on
      ioctl(BLKTRACESETUP), and does all cleanup on ioctl(BLKTRACETEARDOWN).
      
      The ftrace-based blktrace, in contrast, allocates/frees bt when:
        # echo 1/0 > /sys/block/sda/sda1/trace/enable
      
      and registers/unregisters tracepoints when:
        # echo blk/nop > /debugfs/tracing/current_tracer
      or
        # echo 1/0 > /debugfs/tracing/tracing_enable
      
      The separation of allocation and registration causes 2 problems:
      
        1. current user-space blktrace still calls ioctl(TEARDOWN) after
           ioctl(SETUP) has failed:
             # echo 1 > /sys/block/sda/sda1/trace/enable
             # blktrace /dev/sda
               BLKTRACESETUP: Device or resource busy
               ^C
           and now blk_probes_ref == -1
      
        2. Another way to make blk_probes_ref == -1:
           # plugin sdb && mount sdb1
           # echo 1 > /sys/block/sdb/sdb1/trace/enable
           # remove sdb
      
      This patch does both the allocation and the registration when writing
      sdaX/trace/enable.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Acked-by: Jens Axboe <jens.axboe@oracle.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
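The refcount problem described in "blktrace: fix blk_probes_ref chaos" can be illustrated with a small user-space sketch. This is not the kernel patch: `probes_ref`, `struct tracer`, `tracer_setup()` and both teardown helpers are hypothetical stand-ins, and the sketch only shows why a teardown that is not paired with a successful setup drives the counter to -1.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's blk_probes_ref counter. */
static int probes_ref;

/* Hypothetical per-device trace state. */
struct tracer {
    bool registered;
};

/* Setup takes a reference only when it succeeds. */
int tracer_setup(struct tracer *t, bool busy)
{
    if (busy)
        return -1;              /* models "Device or resource busy" */
    t->registered = true;
    probes_ref++;
    return 0;
}

/* Unpaired teardown: drops a reference even if setup never took one. */
void tracer_teardown_unpaired(struct tracer *t)
{
    (void)t;
    probes_ref--;
}

/* Paired teardown: drops the reference only if this tracer holds one. */
void tracer_teardown_paired(struct tracer *t)
{
    if (!t->registered)
        return;
    t->registered = false;
    probes_ref--;
}
```

Calling `tracer_teardown_unpaired()` after a failed `tracer_setup()` leaves `probes_ref` at -1, which is the state the commit message reports; the paired variant leaves it at 0.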
    • blktrace: make classic output more classic · 35ac51bf
      Li Zefan authored
      Impact: fix ftrace plugin timestamp output
      
      In the classic user-space blktrace, the output timestamp is sec.nsec,
      not sec.usec.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Acked-by: Jens Axboe <jens.axboe@oracle.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • blktrace: fix off-by-one bug · eb08f8eb
      Li Zefan authored
      'what' is used as the index into the array what2act, so it must not be >= the array size.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Acked-by: Jens Axboe <jens.axboe@oracle.com>
      Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
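The off-by-one described in "blktrace: fix off-by-one bug" comes down to the comparison used for the bounds check. The sketch below is a minimal illustration, assuming a hypothetical `names[]` table in place of what2act; it is not the kernel code.

```c
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical lookup table standing in for what2act. */
static const char *const names[] = { "queue", "merge", "issue" };

/* Off-by-one check: rejects only i > size, so i == size slips through. */
int index_ok_buggy(size_t i)
{
    return !(i > ARRAY_SIZE(names));
}

/* Correct check: valid indices are 0 .. size - 1. */
int index_ok(size_t i)
{
    return i < ARRAY_SIZE(names);
}

/* Bounds-checked lookup; returns NULL for an out-of-range index. */
const char *lookup(size_t i)
{
    return index_ok(i) ? names[i] : NULL;
}
```

With three entries, index 3 passes the buggy check but is one past the end of the array; the corrected check rejects it.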
    • blktrace: fix the original blktrace · 55547204
      Li Zefan authored
      Currently the original blktrace, which is using relay and is used via
      ioctl, is broken. You can use ftrace to see the output of blktrace,
      but user-space blktrace is unusable.
      
      It's broken by "blktrace: add ftrace plugin"
      (c71a8961)
      
       -	if (unlikely(bt->trace_state != Blktrace_running))
       +	if (unlikely(bt->trace_state != Blktrace_running || !blk_tracer_enabled))
      		return;
      
      With this patch, both ioctl and ftrace can be used, but of course you
      can't use both of them at the same time.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Acked-by: Jens Axboe <jens.axboe@oracle.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • blktrace: fix a race when creating blk_tree_root in debugfs · b5230b56
      Li Zefan authored
      t1                                t2
      ------                            ------
      do_blk_trace_setup()              do_blk_trace_setup()
        if (!blk_tree_root) {
                                          if (!blk_tree_root)
          blk_tree_root = create_dir()
                                            blk_tree_root = create_dir();
                                            (now blk_tree_root == NULL)
        ...
        dir = create_dir(name, blk_tree_root);
      
      Due to this race, t1 will create 'dir' in /debugfs instead of in /debugfs/block.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Acked-by: Jens Axboe <jens.axboe@oracle.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
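The check-then-create race shown in the t1/t2 timeline of "blktrace: fix a race when creating blk_tree_root in debugfs" is commonly closed by doing the check and the creation under one lock. The following user-space sketch assumes a pthread mutex and hypothetical names (`root_lock`, `root`, `create_dir()`, `get_root()`); it illustrates the general technique and is not taken from the patch.

```c
#include <pthread.h>
#include <stdlib.h>

static pthread_mutex_t root_lock = PTHREAD_MUTEX_INITIALIZER;
static void *root;          /* hypothetical stand-in for blk_tree_root */
static int create_calls;    /* counts how often the directory is created */

/* Hypothetical stand-in for the directory-creation call. */
static void *create_dir(void)
{
    create_calls++;
    return malloc(1);
}

/*
 * The NULL check and the assignment happen under the same lock, so a
 * second caller can never observe root == NULL after the first caller
 * has started creating it, and can never overwrite it.
 */
void *get_root(void)
{
    void *r;

    pthread_mutex_lock(&root_lock);
    if (!root)
        root = create_dir();
    r = root;
    pthread_mutex_unlock(&root_lock);
    return r;
}
```

Every caller of `get_root()` sees the same pointer, and `create_dir()` runs at most once as long as it succeeds.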
    • blktrace: fix timestamp in binary output · 6c051ce0
      Li Zefan authored
      I found the timestamp is wrong:
      
       # echo bin > trace_option
       # echo blk > current_tracer
       # cat trace_pipe | blkparse -i -
       8,0    0        0     0.000000000   504  A   W ...
       ...
       8,7    1        0     0.008534097     0  C   R ...
                  (should be 8.534097xxx)
      
      user-space blkparse expects the timestamp to be in nanoseconds.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Acked-by: Jens Axboe <jens.axboe@oracle.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
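The wrong value in "blktrace: fix timestamp in binary output" (0.008534097 instead of 8.534097xxx) is off by a factor of 1000, which is what a consumer expecting nanoseconds prints when it is handed microseconds. The sketch below assumes that reading of the bug; `usecs_to_nsecs()` and `format_sec_nsec()` are hypothetical helpers, not blktrace or blkparse functions.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define NSEC_PER_USEC 1000ULL
#define NSEC_PER_SEC  1000000000ULL

/* Convert a microsecond count to nanoseconds. */
uint64_t usecs_to_nsecs(uint64_t usecs)
{
    return usecs * NSEC_PER_USEC;
}

/* Format a nanosecond timestamp as sec.nsec with nine fractional digits. */
int format_sec_nsec(char *buf, size_t len, uint64_t nsecs)
{
    uint64_t secs = nsecs / NSEC_PER_SEC;
    uint64_t rem = nsecs % NSEC_PER_SEC;

    return snprintf(buf, len, "%" PRIu64 ".%09" PRIu64, secs, rem);
}
```

Formatting the raw value 8534097 as nanoseconds gives `0.008534097`; converting it from microseconds first gives `8.534097000`.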
  2. 24 Mar, 2009 4 commits
    • blktrace: print human-readable act_mask · 09341997
      Li Zefan authored
      Impact: new feature, allow symbolic values in /debug/tracing/act_mask
      
      Print stringified act_mask instead of hex value:
      
       # cat act_mask
       read,write,barrier,sync,queue,requeue,issue,complete,fs,pc,ahead,meta,
       discard,drv_data
       # echo "meta,write" > act_mask
       # cat act_mask
       write,meta
      
      Also:
       - make act_mask accept "ahead", "meta", "discard" and "drv_data"
       - use strsep() instead of strchr() to parse user input
       - return -EINVAL if a token is not found in the mask map
       - fix a bug: 'value' is unsigned, so it can never be < 0
       - propagate the error value of blk_trace_mask2str() to userspace
         instead of always returning -ENXIO.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Acked-by: Jens Axboe <jens.axboe@oracle.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <49C8AB42.1000802@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
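Two points from "blktrace: print human-readable act_mask" can be sketched together: rejecting unknown tokens with -EINVAL, and keeping the error code separate from the unsigned mask so that a failure is actually detectable. The map contents, bit values and `str2mask()` below are hypothetical, and the tokenizer uses portable `strcspn()` rather than the `strsep()` call the commit mentions.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

struct mask_entry {
    unsigned int bit;
    const char *name;
};

/* Hypothetical mask map; bit values are for illustration only. */
static const struct mask_entry mask_map[] = {
    { 1u << 0, "read"  },
    { 1u << 1, "write" },
    { 1u << 2, "meta"  },
};

#define MAP_LEN (sizeof(mask_map) / sizeof(mask_map[0]))

/*
 * Parse a comma-separated token list into a bit mask.
 * Returns 0 on success and stores the mask in *out, or -EINVAL if any
 * token is unknown. The status is a signed int; the mask is returned
 * separately because an unsigned value can never compare below zero.
 */
int str2mask(const char *str, unsigned int *out)
{
    unsigned int mask = 0;
    const char *p = str;

    while (*p) {
        size_t len = strcspn(p, ",");   /* length of the current token */
        size_t i;

        for (i = 0; i < MAP_LEN; i++) {
            if (strlen(mask_map[i].name) == len &&
                strncmp(p, mask_map[i].name, len) == 0) {
                mask |= mask_map[i].bit;
                break;
            }
        }
        if (i == MAP_LEN)
            return -EINVAL;             /* token not in the map */

        p += len;
        if (*p == ',')
            p++;                        /* skip the separator */
    }

    *out = mask;
    return 0;
}
```

With this map, `"meta,write"` parses to the combined bits of both entries, while `"bogus"` returns -EINVAL and leaves `*out` untouched.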
    • blktrace: fix t_error() · e0dc81be
      Li Zefan authored
      Impact: fix error flag output
      
      t_error() should return t->error, not t->sector.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Acked-by: Jens Axboe <jens.axboe@oracle.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <49C8945F.5020802@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • blktrace: fix wrong calculation of RWBS · 65796348
      Li Zefan authored
      Impact: fix the output of IO type category characters
      
      Trace categories are the upper 16 bits, not the lower 16 bits.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Acked-by: Jens Axboe <jens.axboe@oracle.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <49C89432.8010805@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
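"Trace categories are the upper 16 bits, not the lower 16 bits" in "blktrace: fix wrong calculation of RWBS" describes a packed 32-bit action word. The sketch below assumes that layout; the `CAT_*` constants and helper names are hypothetical and only illustrate the shift-versus-mask distinction.

```c
#include <stdint.h>

#define CAT_SHIFT 16

/* Hypothetical category bits, stored in the upper half of the word. */
enum {
    CAT_READ  = 1 << 0,
    CAT_WRITE = 1 << 1,
};

/* Pack a category and an action code into one 32-bit word. */
uint32_t make_action(uint16_t category, uint16_t code)
{
    return ((uint32_t)category << CAT_SHIFT) | code;
}

/* Correct: categories live in the upper 16 bits. */
uint16_t action_category(uint32_t action)
{
    return (uint16_t)(action >> CAT_SHIFT);
}

/* The lower 16 bits hold the action code, not the category. */
uint16_t action_code(uint32_t action)
{
    return (uint16_t)(action & 0xffff);
}
```

Masking with `0xffff` where a shift is needed returns the action code instead of the category, so the category test would be applied to the wrong bits.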
    • blktrace: mark ddir_act[] const · e4955c99
      Li Zefan authored
      Impact: cleanup
      
      ddir_act and what2act always stay immutable.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Acked-by: Jens Axboe <jens.axboe@oracle.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <49C89415.5080503@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  3. 21 Mar, 2009 7 commits
  4. 20 Mar, 2009 1 commit
  5. 19 Mar, 2009 9 commits
    • aio: lookup_ioctx can return the wrong value when looking up a bogus context · 65c24491
      Jeff Moyer authored
      The libaio test harness turned up a problem whereby lookup_ioctx on a
      bogus io context was returning the 1 valid io context from the list
      (harness/cases/3.p).
      
      Because of that, an extra put_iocontext was done, and when the process
      exited, it hit a BUG_ON in the put_iocontext macro called from exit_aio
      (since we expect a users count of 1 and instead get 0).
      
      The problem was introduced by "aio: make the lookup_ioctx() lockless"
      (commit abf137dd).
      
      Thanks to Zach for pointing out that hlist_for_each_entry_rcu will not
      return with a NULL tpos at the end of the loop, even if the entry was
      not found.
      Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
      Acked-by: Zach Brown <zach.brown@oracle.com>
      Acked-by: Jens Axboe <jens.axboe@oracle.com>
      Cc: Benjamin LaHaise <bcrl@kvack.org>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
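The lookup_ioctx commit turns on a loop cursor that is not NULL after an unsuccessful search. The user-space sketch below reproduces the pattern with a plain singly linked list; `struct ctx` and both lookup functions are hypothetical, and the explicit `cur = pos` assignment only mimics how an entry cursor can keep pointing at the last element visited.

```c
#include <stddef.h>

/* Hypothetical context node; not the kernel's struct kioctx. */
struct ctx {
    unsigned long id;
    struct ctx *next;
};

/*
 * Buggy pattern: the entry cursor is assigned on every iteration, so
 * after the loop it still points at the last node visited, even when
 * no node matched.
 */
struct ctx *lookup_buggy(struct ctx *head, unsigned long id)
{
    struct ctx *cur = NULL;

    for (struct ctx *pos = head; pos; pos = pos->next) {
        cur = pos;
        if (cur->id == id)
            break;
    }
    return cur;
}

/* Fixed pattern: a separate result pointer is set only on a match. */
struct ctx *lookup_fixed(struct ctx *head, unsigned long id)
{
    struct ctx *ret = NULL;

    for (struct ctx *pos = head; pos; pos = pos->next) {
        if (pos->id == id) {
            ret = pos;
            break;
        }
    }
    return ret;
}
```

With a single valid context in the list, looking up a bogus id returns that one context from the buggy version and NULL from the fixed one, matching the symptom in the commit message.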
    • eventfd: remove fput() call from possible IRQ context · 87c3a86e
      Davide Libenzi authored
      Remove a source of fput() calls from inside IRQ context.  Like Eric, I
      wasn't able to reproduce an fput() call from IRQ context, but Jeff said he was
      able to, with the attached test program.  Independently from this, the bug is
      conceptually there, so we might be better off fixing it.  This patch adds an
      optimization similar to the one we already do on ->ki_filp, on ->ki_eventfd.
      Playing with ->f_count directly is not pretty in general, but the alternative
      here would be to add a brand new delayed fput() infrastructure, that I'm not
      sure is worth it.
      Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
      Cc: Benjamin LaHaise <bcrl@kvack.org>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Cc: Eric Dumazet <dada1@cosmosbay.com>
      Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
      Cc: Zach Brown <zach.brown@oracle.com>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Move cc-option to below arch-specific setup · d0115552
      Linus Torvalds authored
      Sam Ravnborg says:
       "We have several architectures that plays strange games with $(CC) and
        $(CROSS_COMPILE).
      
        So we need to postpone any use of $(call cc-option..) until we have
        included the arch specific Makefile so we try with the correct $(CC)
        version."
      Requested-by: Sam Ravnborg <sam@ravnborg.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge branch 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6 · caa81d67
      Linus Torvalds authored
      * 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6:
        [S390] make page table upgrade work again
        [S390] make page table walking more robust
        [S390] Dont check for pfn_valid() in uaccess_pt.c
        [S390] ftrace/mcount: fix kernel stack backchain
        [S390] topology: define SD_MC_INIT to fix performance regression
        [S390] __div64_31 broken for CONFIG_MARCH_G5
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid · 2d8620cb
      Linus Torvalds authored
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
        HID: fix waitqueue usage in hiddev
        HID: fix incorrect free in hiddev
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable · fe2fd6cc
      Linus Torvalds authored
      * git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
        Btrfs: Clear space_info full when adding new devices
        Btrfs: Fix locking around adding new space_info
    • Fix race in create_empty_buffers() vs __set_page_dirty_buffers() · a8e7d49a
      Linus Torvalds authored
      Nick Piggin noticed this (very unlikely) race between setting a page
      dirty and creating the buffers for it - we need to hold the mapping
      private_lock until we've set the page dirty bit in order to make sure
      that create_empty_buffers() might not build up a set of buffers without
      the dirty bits set when the page is dirty.
      
      I doubt anybody has ever hit this race (and it didn't solve the issue
      Nick was looking at), but as Nick says: "Still, it does appear to solve
      a real race, which we should close."
      Acked-by: Nick Piggin <nickpiggin@yahoo.com.au>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Add '-fwrapv' to gcc CFLAGS · 68df3755
      Linus Torvalds authored
      This makes sure that gcc doesn't try to optimize away wrapping
      arithmetic, which the kernel occasionally uses for overflow testing, ie
      things like
      
      	if (ptr + offset < ptr)
      
      which technically is undefined for non-unsigned types. See
      
      	http://bugzilla.kernel.org/show_bug.cgi?id=12597
      
      for details.
      
      Not all versions of gcc support it, so we need to make it conditional
      (it looks like it was introduced in gcc-3.4).
      Reminded-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
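The `ptr + offset < ptr` idiom from "Add '-fwrapv' to gcc CFLAGS" relies on wrapping arithmetic, which standard C guarantees only for unsigned types. As a point of comparison, the sketch below shows overflow tests that stay well defined without `-fwrapv`; the function names are hypothetical and this is not how the kernel performs these checks.

```c
#include <limits.h>

/*
 * Unsigned addition wraps modulo 2^N by definition, so comparing the
 * sum with an operand is a well-defined overflow test.
 */
int add_overflows_unsigned(unsigned long a, unsigned long b)
{
    return a + b < a;
}

/*
 * Signed overflow is undefined in standard C, so the test is done
 * before the addition by comparing against the type's limits.
 */
int add_overflows_signed(int a, int b)
{
    if (b > 0)
        return a > INT_MAX - b;
    if (b < 0)
        return a < INT_MIN - b;
    return 0;
}
```

The signed variant never evaluates an overflowing expression, which is why it does not depend on the compiler treating signed arithmetic as wrapping.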
    • tracing/ring-buffer: fix non cpu hotplug case · 3bf832ce
      Frederic Weisbecker authored
      Impact: fix warning with irqsoff tracer
      
      The ring buffer allocates its buffers at pre-SMP time (early_initcall).
      It means that, at first, only the boot cpu buffer is allocated and
      the ring-buffer cpumask only has the boot cpu set (cpu_online_mask).
      
      Later, the secondary cpu will show up and the ring-buffer will be notified
      about this event: the appropriate buffer will be allocated and the cpumask
      will be updated.
      
      Unfortunately, if !CONFIG_CPU_HOTPLUG, the ring-buffer will not be
      notified about the secondary cpus, meaning that the cpumask will have
      only the boot cpu set, and only one cpu buffer allocated.
      
      We fix that by using cpu_possible_mask if !CONFIG_CPU_HOTPLUG.
      
      This patch fixes the following warning with irqsoff tracer running:
      
      [  169.317794] WARNING: at kernel/trace/trace.c:466 update_max_tr_single+0xcc/0xf3()
      [  169.318002] Hardware name: AMILO Li 2727
      [  169.318002] Modules linked in:
      [  169.318002] Pid: 5624, comm: bash Not tainted 2.6.29-rc8-tip-02636-g6aafa6c #11
      [  169.318002] Call Trace:
      [  169.318002]  [<ffffffff81036182>] warn_slowpath+0xea/0x13d
      [  169.318002]  [<ffffffff8100b9d6>] ? ftrace_call+0x5/0x2b
      [  169.318002]  [<ffffffff8100b9d6>] ? ftrace_call+0x5/0x2b
      [  169.318002]  [<ffffffff8100b9d1>] ? ftrace_call+0x0/0x2b
      [  169.318002]  [<ffffffff8101ef10>] ? ftrace_modify_code+0xa9/0x108
      [  169.318002]  [<ffffffff8106e27f>] ? trace_hardirqs_off+0x25/0x27
      [  169.318002]  [<ffffffff8149afe7>] ? _spin_unlock_irqrestore+0x1f/0x2d
      [  169.318002]  [<ffffffff81064f52>] ? ring_buffer_reset_cpu+0xf6/0xfb
      [  169.318002]  [<ffffffff8106637c>] ? ring_buffer_reset+0x36/0x48
      [  169.318002]  [<ffffffff8106aeda>] update_max_tr_single+0xcc/0xf3
      [  169.318002]  [<ffffffff8100bc17>] ? sysret_check+0x22/0x5d
      [  169.318002]  [<ffffffff8106e3ea>] stop_critical_timing+0x142/0x204
      [  169.318002]  [<ffffffff8106e4cf>] trace_hardirqs_on_caller+0x23/0x25
      [  169.318002]  [<ffffffff8149ac28>] trace_hardirqs_on_thunk+0x3a/0x3c
      [  169.318002]  [<ffffffff8100bc17>] ? sysret_check+0x22/0x5d
      [  169.318002] ---[ end trace db76cbf775a750cf ]---
      
      The warning triggers because this tracer may try to swap two cpu ring
      buffers for a cpu that is not registered on the ring buffer.
      
      This patch might also fix a fair loss of traces due to unallocated buffers
      for secondary cpus.
      Reported-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <1237470453-5427-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  6. 18 Mar, 2009 13 commits