1. 24 Mar, 2016 4 commits
    • Peter Zijlstra's avatar
      perf: Cure event->pending_disable race · 882f862d
      Peter Zijlstra authored
      [ Upstream commit 28a967c3 ]
      
      Because event_sched_out() checks event->pending_disable _before_
      actually disabling the event, it can happen that the event fires after
      it checks but before it gets disabled.
      
      This would leave event->pending_disable set and the queued irq_work
      will try and process it.
      
      However, if the event trigger was during schedule(), the event might
      have been de-scheduled by the time the irq_work runs, and
      perf_event_disable_local() will fail.
      
      Fix this by checking event->pending_disable _after_ we call
      event->pmu->del(). This depends on the latter being a compiler
      barrier, such that the compiler does not lift the load and re-creates
      the problem.
      Tested-by: default avatarAlexander Shishkin <alexander.shishkin@linux.intel.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: default avatarAlexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: dvyukov@google.com
      Cc: eranian@google.com
      Cc: oleg@redhat.com
      Cc: panand@redhat.com
      Cc: sasha.levin@oracle.com
      Cc: vince@deater.net
      Link: http://lkml.kernel.org/r/20160224174948.040469884@infradead.orgSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarHe Kuang <hekuang@huawei.com>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      882f862d
    • Peter Zijlstra's avatar
      perf: Do not double free · 5709e7ba
      Peter Zijlstra authored
      [ Upstream commit 13005627 ]
      
      In case of: err_file: fput(event_file), we'll end up calling
      perf_release() which in turn will free the event.
      
      Do not then free the event _again_.
      Tested-by: default avatarAlexander Shishkin <alexander.shishkin@linux.intel.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: default avatarAlexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: dvyukov@google.com
      Cc: eranian@google.com
      Cc: oleg@redhat.com
      Cc: panand@redhat.com
      Cc: sasha.levin@oracle.com
      Cc: vince@deater.net
      Link: http://lkml.kernel.org/r/20160224174947.697350349@infradead.orgSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarHe Kuang <hekuang@huawei.com>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      5709e7ba
    • Yang Shi's avatar
      arm64: replace read_lock to rcu lock in call_step_hook · 143cf26c
      Yang Shi authored
      [ Upstream commit cf0a2543 ]
      
      BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:917
      in_atomic(): 1, irqs_disabled(): 128, pid: 383, name: sh
      Preemption disabled at:[<ffff800000124c18>] kgdb_cpu_enter+0x158/0x6b8
      
      CPU: 3 PID: 383 Comm: sh Tainted: G        W       4.1.13-rt13 #2
      Hardware name: Freescale Layerscape 2085a RDB Board (DT)
      Call trace:
      [<ffff8000000885e8>] dump_backtrace+0x0/0x128
      [<ffff800000088734>] show_stack+0x24/0x30
      [<ffff80000079a7c4>] dump_stack+0x80/0xa0
      [<ffff8000000bd324>] ___might_sleep+0x18c/0x1a0
      [<ffff8000007a20ac>] __rt_spin_lock+0x2c/0x40
      [<ffff8000007a2268>] rt_read_lock+0x40/0x58
      [<ffff800000085328>] single_step_handler+0x38/0xd8
      [<ffff800000082368>] do_debug_exception+0x58/0xb8
      Exception stack(0xffff80834a1e7c80 to 0xffff80834a1e7da0)
      7c80: ffffff9c ffffffff 92c23ba0 0000ffff 4a1e7e40 ffff8083 001bfcc4 ffff8000
      7ca0: f2000400 00000000 00000000 00000000 4a1e7d80 ffff8083 0049501c ffff8000
      7cc0: 00005402 00000000 00aaa210 ffff8000 4a1e7ea0 ffff8083 000833f4 ffff8000
      7ce0: ffffff9c ffffffff 92c23ba0 0000ffff 4a1e7ea0 ffff8083 001bfcc0 ffff8000
      7d00: 4a0fc400 ffff8083 00005402 00000000 4a1e7d40 ffff8083 00490324 ffff8000
      7d20: ffffff9c 00000000 92c23ba0 0000ffff 000a0000 00000000 00000000 00000000
      7d40: 00000008 00000000 00080000 00000000 92c23b8b 0000ffff 92c23b8e 0000ffff
      7d60: 00000038 00000000 00001cb2 00000000 00000005 00000000 92d7b498 0000ffff
      7d80: 01010101 01010101 92be9000 0000ffff 00000000 00000000 00000030 00000000
      [<ffff8000000833f4>] el1_dbg+0x18/0x6c
      
      This issue is similar with 62c6c61a("arm64: replace read_lock to rcu lock in
      call_break_hook"), but comes to single_step_handler.
      
      This also solves kgdbts boot test silent hang issue on 4.4 -rt kernel.
      Signed-off-by: default avatarYang Shi <yang.shi@linaro.org>
      Acked-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: default avatarHe Kuang <hekuang@huawei.com>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      143cf26c
    • Yang Shi's avatar
      arm64: replace read_lock to rcu lock in call_break_hook · 1a138f3e
      Yang Shi authored
      [ Upstream commit 62c6c61a ]
      
      BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:917
      in_atomic(): 0, irqs_disabled(): 128, pid: 342, name: perf
      1 lock held by perf/342:
       #0:  (break_hook_lock){+.+...}, at: [<ffffffc0000851ac>] call_break_hook+0x34/0xd0
      irq event stamp: 62224
      hardirqs last  enabled at (62223): [<ffffffc00010b7bc>] __call_rcu.constprop.59+0x104/0x270
      hardirqs last disabled at (62224): [<ffffffc0000fbe20>] vprintk_emit+0x68/0x640
      softirqs last  enabled at (0): [<ffffffc000097928>] copy_process.part.8+0x428/0x17f8
      softirqs last disabled at (0): [<          (null)>]           (null)
      CPU: 0 PID: 342 Comm: perf Not tainted 4.1.6-rt5 #4
      Hardware name: linux,dummy-virt (DT)
      Call trace:
      [<ffffffc000089968>] dump_backtrace+0x0/0x128
      [<ffffffc000089ab0>] show_stack+0x20/0x30
      [<ffffffc0007030d0>] dump_stack+0x7c/0xa0
      [<ffffffc0000c878c>] ___might_sleep+0x174/0x260
      [<ffffffc000708ac8>] __rt_spin_lock+0x28/0x40
      [<ffffffc000708db0>] rt_read_lock+0x60/0x80
      [<ffffffc0000851a8>] call_break_hook+0x30/0xd0
      [<ffffffc000085a70>] brk_handler+0x30/0x98
      [<ffffffc000082248>] do_debug_exception+0x50/0xb8
      Exception stack(0xffffffc00514fe30 to 0xffffffc00514ff50)
      fe20:                                     00000000 00000000 c1594680 0000007f
      fe40: ffffffff ffffffff 92063940 0000007f 0550dcd8 ffffffc0 00000000 00000000
      fe60: 0514fe70 ffffffc0 000be1f8 ffffffc0 0514feb0 ffffffc0 0008948c ffffffc0
      fe80: 00000004 00000000 0514fed0 ffffffc0 ffffffff ffffffff 9282a948 0000007f
      fea0: 00000000 00000000 9282b708 0000007f c1592820 0000007f 00083914 ffffffc0
      fec0: 00000000 00000000 00000010 00000000 00000064 00000000 00000001 00000000
      fee0: 005101e0 00000000 c1594680 0000007f c1594740 0000007f ffffffd8 ffffff80
      ff00: 00000000 00000000 00000000 00000000 c1594770 0000007f c1594770 0000007f
      ff20: 00665e10 00000000 7f7f7f7f 7f7f7f7f 01010101 01010101 00000000 00000000
      ff40: 928e4cc0 0000007f 91ff11e8 0000007f
      
      call_break_hook is called in atomic context (hard irq disabled), so replace
      the sleepable lock to rcu lock, replace relevant list operations to rcu
      version and call synchronize_rcu() in unregister_break_hook().
      
      And, replace write lock to spinlock in {un}register_break_hook.
      Signed-off-by: default avatarYang Shi <yang.shi@linaro.org>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarHe Kuang <hekuang@huawei.com>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      1a138f3e
  2. 23 Mar, 2016 4 commits
    • Jan Kara's avatar
      ext4: fix races of writeback with punch hole and zero range · f2b13259
      Jan Kara authored
      When doing delayed allocation, update of on-disk inode size is postponed
      until IO submission time. However hole punch or zero range fallocate
      calls can end up discarding the tail page cache page and thus on-disk
      inode size would never be properly updated.
      
      Make sure the on-disk inode size is updated before truncating page
      cache.
      Signed-off-by: default avatarJan Kara <jack@suse.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Reviewed-by: default avatarMingming Cao <mingming.cao@oracle.com>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      f2b13259
    • Jan Kara's avatar
      ext4: fix races between buffered IO and collapse / insert range · 181aaebd
      Jan Kara authored
      Current code implementing FALLOC_FL_COLLAPSE_RANGE and
      FALLOC_FL_INSERT_RANGE is prone to races with buffered writes and page
      faults. If buffered write or write via mmap manages to squeeze between
      filemap_write_and_wait_range() and truncate_pagecache() in the fallocate
      implementations, the written data is simply discarded by
      truncate_pagecache() although it should have been shifted.
      
      Fix the problem by moving filemap_write_and_wait_range() call inside
      i_mutex and i_mmap_sem. That way we are protected against races with
      both buffered writes and page faults.
      Signed-off-by: default avatarJan Kara <jack@suse.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Reviewed-by: default avatarMingming Cao <mingming.cao@oracle.com>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      181aaebd
    • Jan Kara's avatar
      ext4: move unlocked dio protection from ext4_alloc_file_blocks() · 9621787d
      Jan Kara authored
      Currently ext4_alloc_file_blocks() was handling protection against
      unlocked DIO. However we now need to sometimes call it under i_mmap_sem
      and sometimes not and DIO protection ranks above it (although strictly
      speaking this cannot currently create any deadlocks). Also
      ext4_zero_range() was actually getting & releasing unlocked DIO
      protection twice in some cases. Luckily it didn't introduce any real bug
      but it was a land mine waiting to be stepped on.  So move DIO protection
      out from ext4_alloc_file_blocks() into the two callsites.
      Signed-off-by: default avatarJan Kara <jack@suse.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Reviewed-by: default avatarMingming Cao <mingming.cao@oracle.com>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      9621787d
    • Jan Kara's avatar
      ext4: fix races between page faults and hole punching · 248766f0
      Jan Kara authored
      Currently, page faults and hole punching are completely unsynchronized.
      This can result in page fault faulting in a page into a range that we
      are punching after truncate_pagecache_range() has been called and thus
      we can end up with a page mapped to disk blocks that will be shortly
      freed. Filesystem corruption will shortly follow. Note that the same
      race is avoided for truncate by checking page fault offset against
      i_size but there isn't similar mechanism available for punching holes.
      
      Fix the problem by creating new rw semaphore i_mmap_sem in inode and
      grab it for writing over truncate, hole punching, and other functions
      removing blocks from extent tree and for read over page faults. We
      cannot easily use i_data_sem for this since that ranks below transaction
      start and we need something ranking above it so that it can be held over
      the whole truncate / hole punching operation. Also remove various
      workarounds we had in the code to reduce race window when page fault
      could have created pages with stale mapping information.
      Signed-off-by: default avatarJan Kara <jack@suse.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Reviewed-by: default avatarMingming Cao <mingming.cao@oracle.com>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      248766f0
  3. 22 Mar, 2016 26 commits
  4. 18 Mar, 2016 6 commits