1. 11 Nov, 2013 4 commits
    • perf record: Synthesize non-exec MMAP records when --data used · 62605dc5
      Arnaldo Carvalho de Melo authored
      When perf_event_attr.mmap_data is set the kernel will generate
      PERF_RECORD_MMAP events when non-exec (data, SysV mem) mmaps are
      created, so we need to synthesize from /proc/pid/maps for existing
      threads, as we do for exec mmaps.
      
      Right now just 'perf record' does it, but any other tool that uses
      perf_event__synthesize_thread(s|map) can request it.
      Reported-by: Don Zickus <dzickus@redhat.com>
      Tested-by: Don Zickus <dzickus@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Bill Gray <bgray@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Joe Mario <jmario@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Richard Fowles <rfowles@redhat.com>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-ihwzraikx23ian9txinogvv2@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
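The synthesis step described in the commit above can be illustrated with a minimal userspace sketch. It is not perf's implementation: `struct map_entry`, `parse_maps_line()` and `should_synthesize()` are hypothetical names introduced here, and the only assumption taken from the commit message is that executable mappings are always synthesized while non-exec ones are synthesized only when data mmaps were requested.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical record for one /proc/<pid>/maps line (not a perf struct). */
struct map_entry {
	unsigned long long start, end, pgoff;
	char perms[5];   /* e.g. "r-xp" */
	char path[256];  /* empty for anonymous mappings */
};

/* Parse one maps line; returns 0 on success, -1 if the line is malformed. */
static int parse_maps_line(const char *line, struct map_entry *e)
{
	unsigned int maj, min;
	unsigned long ino;
	int n;

	e->path[0] = '\0';
	n = sscanf(line, "%llx-%llx %4s %llx %x:%x %lu %255[^\n]",
		   &e->start, &e->end, e->perms, &e->pgoff,
		   &maj, &min, &ino, e->path);
	/* The path is optional, so 7 converted fields are enough. */
	return n >= 7 ? 0 : -1;
}

/*
 * Assumed policy: exec mappings are always synthesized; non-exec
 * mappings only when the caller asked for data mmaps.
 */
static bool should_synthesize(const struct map_entry *e, bool mmap_data)
{
	bool exec = e->perms[2] == 'x';

	return exec || mmap_data;
}
```

A caller would read /proc/<pid>/maps line by line, pass each line to `parse_maps_line()`, and emit a record for every entry that `should_synthesize()` accepts.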
    • perf evsel: Remove idx parm from constructor · ef503831
      Arnaldo Carvalho de Melo authored
      Most uses of the evsel constructor are followed by a call to
      perf_evlist__add with an idx of evlist->nr_entries, so rename
      the current constructor to perf_evsel__new_idx and remove the need
      for passing the idx to the constructor in the common case.
      
      We still need the new_idx variant because of the way groups are handled,
      with evsel->nr_members holding the number of entries in an evlist,
      partitioning the evlist into sublists inside a single linked list.
      
      This asks for a clarifying refactoring, but for now simplify the non
      parser cases, so that tool writers don't have to bother with evsel idx
      setting.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-zy9tskx6jqm2rmw7468zze2a@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
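The constructor split described in the commit above follows a common pattern, sketched below with toy types. All `toy_*` names are hypothetical and the structures are deliberately much smaller than perf's; in particular, having the list's add function assign the index is an assumption about how the common case could avoid passing one.

```c
#include <stdlib.h>

/* Toy stand-ins for an evsel and an evlist (not perf's structures). */
struct toy_evsel {
	int idx;
	struct toy_evsel *next;
};

struct toy_evlist {
	int nr_entries;
	struct toy_evsel *head, *tail;
};

/* Explicit-index constructor, kept for callers that manage idx themselves. */
static struct toy_evsel *toy_evsel_new_idx(int idx)
{
	struct toy_evsel *e = calloc(1, sizeof(*e));

	if (e)
		e->idx = idx;
	return e;
}

/* Common-case constructor: the caller does not think about idx at all. */
static struct toy_evsel *toy_evsel_new(void)
{
	return toy_evsel_new_idx(0);
}

/* Appending assigns the next free index (assumed behaviour in this sketch). */
static void toy_evlist_add(struct toy_evlist *l, struct toy_evsel *e)
{
	e->idx = l->nr_entries++;
	e->next = NULL;
	if (l->tail)
		l->tail->next = e;
	else
		l->head = e;
	l->tail = e;
}
```

With this shape, tool code only calls `toy_evsel_new()` followed by `toy_evlist_add()`, while a parser that builds groups can still use `toy_evsel_new_idx()`.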
    • perf ui tui progress: Don't force a refresh during progress update · d53e57d0
      Patrick Palka authored
      Each call to tui_progress__update() would forcibly refresh the entire
      screen.  This is somewhat inefficient and causes noticeable flickering
      during the startup of perf-report, especially on large/slow terminals.
      
      It looks like the force-refresh in tui_progress__update() serves no
      purpose other than to clear the screen so that the progress bar of a
      previous operation does not subsume that of a subsequent operation.  But
      we can do just that in a much more efficient manner by clearing only the
      region that a previous progress bar may have occupied before repainting
      the new progress bar.  Then the force-refresh could be removed with no
      change in visuals.
      
      This patch disables the slow force-refresh in tui_progress__update() and
      instead calls SLsmg_fill_region() on the entire area that the progress
      bar may occupy before repainting it.  This change makes the startup of
      perf-report much faster and appear much "smoother".
      
      It turns out that this was a big bottleneck in the startup speed of
      perf-report -- with this patch, perf-report starts up ~2x faster (1.1s
      vs 0.55s) on my machines.  (These numbers were measured by running "time
      perf report" on an 8MB perf.data and pressing 'q' immediately.)
      Signed-off-by: Patrick Palka <patrick@parcs.ath.cx>
      Acked-by: Ingo Molnar <mingo@kernel.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1382747149-9716-1-git-send-email-patrick@parcs.ath.cx
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • Merge branch 'uprobes/core' of... · caea6cf5
      Ingo Molnar authored
      Merge branch 'uprobes/core' of git://git.kernel.org/pub/scm/linux/kernel/git/oleg/misc into perf/core
      
      Pull uprobes fixes from Oleg Nesterov.
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  2. 09 Nov, 2013 2 commits
    • uprobes: Fix the memory out of bound overwrite in copy_insn() · 2ded0980
      Oleg Nesterov authored
      1. copy_insn() doesn't look very nice: all the calculations are
         confusing and it is not immediately clear why we read
         the 2nd page first.
      
      2. The usage of inode->i_size is wrong on 32-bit machines.
      
      3. "Instruction at end of binary" logic is simply wrong, it
         doesn't handle the case when uprobe->offset > inode->i_size.
      
         In this case "bytes" overflows, and __copy_insn() writes to
         the memory outside of uprobe->arch.insn.
      
         Yes, uprobe_register() checks i_size_read(), but this file
         can be truncated after that. All i_size checks are racy, we
         do this only to catch the obvious mistakes.
      
      Change copy_insn() to call __copy_insn() in a loop, simplify
      and fix the bytes/nbytes calculations.
      
      Note: we do not care if we read extra bytes after inode->i_size
      if we got the valid page. This is fine because the task gets the
      same page after page-fault, and arch_uprobe_analyze_insn() can't
      know how many bytes were actually read anyway.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
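The loop-based approach described in the commit above can be modelled in userspace as follows. This is a sketch under stated assumptions, not the kernel code: the `toy_*` names, the 16-byte page size and the 8-byte instruction buffer are all hypothetical, and the backing buffer is assumed to be padded to a whole number of pages so that reading past the file size within a valid page is harmless, as the commit's note describes.

```c
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SIZE  16u  /* deliberately tiny so the boundary is easy to hit */
#define TOY_INSN_BYTES 8u   /* size of the destination instruction buffer */

/* Copy len bytes that must lie entirely within one page. */
static int toy_copy_page(const unsigned char *file, void *dst,
			 size_t len, size_t offs)
{
	if (offs / TOY_PAGE_SIZE != (offs + len - 1) / TOY_PAGE_SIZE)
		return -1;
	memcpy(dst, file + offs, len);
	return 0;
}

/*
 * Fill insn[] starting at file offset offs, one page-bounded chunk at a
 * time. Nothing is written if offs is already at or past file_size, so
 * the length can never underflow into a huge copy.
 */
static int toy_copy_insn(const unsigned char *file, size_t file_size,
			 size_t offs, unsigned char *insn)
{
	size_t size = TOY_INSN_BYTES;
	int err = -1;

	do {
		size_t room, len;

		if (offs >= file_size)
			break;

		room = TOY_PAGE_SIZE - (offs % TOY_PAGE_SIZE);
		len = size < room ? size : room;
		err = toy_copy_page(file, insn, len, offs);
		if (err)
			break;

		insn += len;
		offs += len;
		size -= len;
	} while (size);

	return err;
}
```

Because each chunk length is the minimum of the bytes still wanted and the bytes left in the current page, the copy never exceeds the destination buffer, whatever relation the offset has to the file size.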
    • uprobes: Fix the wrong usage of current->utask in uprobe_copy_process() · 70d7f987
      Oleg Nesterov authored
      Commit aa59c53f "uprobes: Change uprobe_copy_process() to dup
      xol_area" has a stupid typo: we need to set up t->utask->vaddr but
      the code wrongly uses current->utask.
      
      Even with this bug dup_xol_work() works "in practice", but only
      because get_unmapped_area(NULL, TASK_SIZE - PAGE_SIZE) likely
      returns the same address every time.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
  3. 07 Nov, 2013 9 commits
  4. 06 Nov, 2013 16 commits
  5. 05 Nov, 2013 9 commits