1. 19 Oct, 2018 2 commits
    • locking/lockdep: Make global debug_locks* variables read-mostly · 01a14bda
      Waiman Long authored
      Make the frequently used lockdep global variable debug_locks read-mostly.
      As debug_locks_silent is sometimes used together with debug_locks,
      it is also made read-mostly so that the two can sit close together.
      
      With false cacheline sharing, cacheline contention problems can occur
      depending on what gets put into the same cacheline as debug_locks.
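
      A minimal sketch of the resulting declarations (a hedged illustration;
      the exact placement in lib/debug_locks.c is assumed):

        /* Both flags land in the read-mostly data section, away from
         * frequently written data that could cause false sharing. */
        int debug_locks __read_mostly = 1;
        int debug_locks_silent __read_mostly;
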
      Signed-off-by: Waiman Long <longman@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Link: http://lkml.kernel.org/r/1539913518-15598-2-git-send-email-longman@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • locking/lockdep: Fix debug_locks off performance problem · 9506a742
      Waiman Long authored
      It was found that when debug_locks was turned off because of a problem
      found by the lockdep code, the system performance could drop quite
      significantly when the lock_stat code was also configured into the
      kernel. For instance, parallel kernel build time on a 4-socket x86-64
      server nearly doubled.
      
      Further analysis traced the slowdown back to frequent calls to
      debug_locks_off() from the __lock_acquired() function, probably due to
      some inconsistent lockdep state left behind once debug_locks is off.
      The debug_locks_off() function did an unconditional atomic xchg to
      write a 0 value into debug_locks, which had already been set to 0.
      This led to severe cacheline contention on the cacheline that held
      debug_locks.  As debug_locks is referenced in quite a few different
      places in the kernel, this greatly slowed down system performance.
      
      To prevent that thrashing of the debug_locks cacheline, lock_acquired()
      and lock_contended() now check the state of debug_locks before
      proceeding. The debug_locks_off() function is also modified to check
      debug_locks before calling __debug_locks_off().
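
      A hedged sketch of the modified check (close to, but not guaranteed to
      match, the patched lib/debug_locks.c):

        int debug_locks_off(void)
        {
                /* Only attempt the atomic xchg while debug_locks is still
                 * set; this avoids rewriting 0 over an already-zero word
                 * that many CPUs are reading. */
                if (debug_locks && __debug_locks_off()) {
                        if (!debug_locks_silent) {
                                console_verbose();
                                return 1;
                        }
                }
                return 0;
        }
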
      Signed-off-by: Waiman Long <longman@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Link: http://lkml.kernel.org/r/1539913518-15598-1-git-send-email-longman@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  2. 17 Oct, 2018 2 commits
    • locking/pvqspinlock: Extend node size when pvqspinlock is configured · 0fa809ca
      Waiman Long authored
      The qspinlock code supports up to 4 levels of slowpath nesting using
      four per-CPU mcs_spinlock structures. For 64-bit architectures, they
      fit nicely in one 64-byte cacheline.
      
      The para-virtualized (PV) qspinlock code, however, needs to store more
      information in the per-CPU node structure than there is space for, so it
      resorts to a trick of using a second cacheline to hold the extra
      information. As a result, PV qspinlock needs to access two extra
      cachelines for its information, whereas the native qspinlock code only
      needs one extra cacheline.
      
      Freshly added counter profiling of the qspinlock code, however, revealed
      that it was very rare to use more than two levels of slowpath nesting.
      So it doesn't make sense to penalize PV qspinlock code in order to have
      four mcs_spinlock structures in the same cacheline to optimize for a case
      in the native qspinlock code that rarely happens.
      
      Extend the per-CPU node structure by two more long words when PV
      qspinlocks are configured, so that the extra data can be held directly.
      
      As a result, the PV qspinlock code will, in most cases, enjoy the same
      benefit as its native counterpart of needing just one extra cacheline.
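
      A hedged sketch of what the extended per-CPU node might look like (the
      structure name and field layout are assumptions based on the changelog,
      not copied from the patch):

        struct qnode {
                struct mcs_spinlock mcs;
        #ifdef CONFIG_PARAVIRT_SPINLOCKS
                long reserved[2];       /* room for the extra PV fields */
        #endif
        };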
      
      [ mingo: Minor changelog edits. ]
      Signed-off-by: Waiman Long <longman@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Link: http://lkml.kernel.org/r/1539697507-28084-2-git-send-email-longman@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • locking/qspinlock_stat: Count instances of nested lock slowpaths · 1222109a
      Waiman Long authored
      The queued spinlock code supports up to 4 levels of lock slowpath nesting:
      user context, soft IRQ, hard IRQ and NMI. However, we are not sure how
      often such nesting actually happens.
      
      So add 3 more per-CPU stat counters to track the number of instances where
      the nesting index goes to 1, 2 and 3, respectively.
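
      A hedged sketch of how such a counter bump might look in the slowpath
      (the qstat_inc() helper and counter names follow the existing qspinlock
      statistics code, but are not copied verbatim from the patch):

        idx = node->count++;

        /* Count the non-zero nesting indices (idx 1..3). */
        qstat_inc(qstat_lock_idx1 + idx - 1, idx);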
      
      On a dual-socket 64-core 128-thread Zen server, the following were the
      new stat counter values under different circumstances:
      
        State                              slowpath   index1   index2   index3
        -----                              --------   ------   ------   ------
        After bootup                      1,012,150       82        0        0
        After parallel build + perf-top 125,195,009       82        0        0
      
      So the chance of having more than 2 levels of nesting is extremely low.
      
      [ mingo: Minor changelog edits. ]
      Signed-off-by: Waiman Long <longman@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Link: http://lkml.kernel.org/r/1539697507-28084-1-git-send-email-longman@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  3. 16 Oct, 2018 6 commits
  4. 10 Oct, 2018 1 commit
  5. 09 Oct, 2018 2 commits
  6. 06 Oct, 2018 4 commits
  7. 05 Oct, 2018 1 commit
  8. 04 Oct, 2018 8 commits
    • x86/paravirt: Work around GCC inlining bugs when compiling paravirt ops · 494b5168
      Nadav Amit authored
      As described in:
      
        77b0bf55: ("kbuild/Makefile: Prepare for using macros in inline assembly code to work around asm() related GCC inlining bugs")
      
      GCC's inlining heuristics are broken with common asm() patterns used in
      kernel code, resulting in the effective disabling of inlining.
      
      The workaround is to set an assembly macro and call it from the inline
      assembly block. As a result GCC considers the inline assembly block as
      a single instruction. (Which it isn't, but that's the best we can get.)
      
      In this patch we wrap the paravirt call section tricks in a macro,
      hiding them from GCC.
      
      The effect of the patch is more aggressive inlining, which also
      causes a kernel size increase:
      
            text     data     bss      dec     hex  filename
        18147336 10226688 2957312 31331336 1de1408  ./vmlinux before
        18162555 10226288 2957312 31346155 1de4deb  ./vmlinux after (+14819)
      
      The number of static text symbols (non-inlined functions) goes down:
      
        Before: 40053
        After:  39942 (-111)
      
      [ mingo: Rewrote the changelog. ]
      Tested-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Nadav Amit <namit@vmware.com>
      Reviewed-by: Juergen Gross <jgross@suse.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alok Kataria <akataria@vmware.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: virtualization@lists.linux-foundation.org
      Link: http://lkml.kernel.org/r/20181003213100.189959-8-namit@vmware.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/bug: Macrofy the BUG table section handling, to work around GCC inlining bugs · f81f8ad5
      Nadav Amit authored
      As described in:
      
        77b0bf55: ("kbuild/Makefile: Prepare for using macros in inline assembly code to work around asm() related GCC inlining bugs")
      
      GCC's inlining heuristics are broken with common asm() patterns used in
      kernel code, resulting in the effective disabling of inlining.
      
      The workaround is to set an assembly macro and call it from the inline
      assembly block. As a result GCC considers the inline assembly block as
      a single instruction. (Which it isn't, but that's the best we can get.)
      
      This patch increases the kernel size:
      
            text     data     bss      dec     hex  filename
        18146889 10225380 2957312 31329581 1de0d2d  ./vmlinux before
        18147336 10226688 2957312 31331336 1de1408  ./vmlinux after (+1755)
      
      But enables more aggressive inlining (and probably better branch decisions).
      
      The number of static text symbols in vmlinux is much lower:
      
       Before: 40218
       After:  40053 (-165)
      
      The assembly code gets harder to read due to the extra macro layer.
      
      [ mingo: Rewrote the changelog. ]
      Tested-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Nadav Amit <namit@vmware.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20181003213100.189959-7-namit@vmware.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/alternatives: Macrofy lock prefixes to work around GCC inlining bugs · 77f48ec2
      Nadav Amit authored
      As described in:
      
        77b0bf55: ("kbuild/Makefile: Prepare for using macros in inline assembly code to work around asm() related GCC inlining bugs")
      
      GCC's inlining heuristics are broken with common asm() patterns used in
      kernel code, resulting in the effective disabling of inlining.
      
      The workaround is to set an assembly macro and call it from the inline
      assembly block - i.e. to macrofy the affected block.
      
      As a result GCC considers the inline assembly block as a single instruction.
      
      This patch handles the LOCK prefix, allowing more aggressive inlining:
      
            text     data     bss      dec     hex  filename
        18140140 10225284 2957312 31322736 1ddf270  ./vmlinux before
        18146889 10225380 2957312 31329581 1de0d2d  ./vmlinux after (+6845)
      
      This is the reduction in non-inlined functions:
      
        Before: 40286
        After:  40218 (-68)
      Tested-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Nadav Amit <namit@vmware.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20181003213100.189959-6-namit@vmware.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/refcount: Work around GCC inlining bug · 9e1725b4
      Nadav Amit authored
      As described in:
      
        77b0bf55: ("kbuild/Makefile: Prepare for using macros in inline assembly code to work around asm() related GCC inlining bugs")
      
      GCC's inlining heuristics are broken with common asm() patterns used in
      kernel code, resulting in the effective disabling of inlining.
      
      The workaround is to set an assembly macro and call it from the inline
      assembly block. As a result GCC considers the inline assembly block as
      a single instruction. (Which it isn't, but that's the best we can get.)
      
      This patch allows GCC to inline simple functions such as __get_seccomp_filter().
      
      To no-one's surprise the result is that GCC makes more aggressive (read: correct)
      inlining decisions in these scenarios, which reduces the kernel size and presumably
      also speeds it up:
      
            text     data     bss      dec     hex  filename
        18140970 10225412 2957312 31323694 1ddf62e  ./vmlinux before
        18140140 10225284 2957312 31322736 1ddf270  ./vmlinux after (-958)
      
      16 fewer static text symbols:
      
         Before: 40302
         After:  40286 (-16)

      These got inlined instead.
      
      Functions such as kref_get(), free_user(), fuse_file_get() now get inlined. Hurray!
      
      [ mingo: Rewrote the changelog. ]
      Tested-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Nadav Amit <namit@vmware.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jan Beulich <JBeulich@suse.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20181003213100.189959-5-namit@vmware.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/objtool: Use asm macros to work around GCC inlining bugs · c06c4d80
      Nadav Amit authored
      As described in:
      
        77b0bf55: ("kbuild/Makefile: Prepare for using macros in inline assembly code to work around asm() related GCC inlining bugs")
      
      GCC's inlining heuristics are broken with common asm() patterns used in
      kernel code, resulting in the effective disabling of inlining.
      
      In the case of objtool the resulting borkage can be significant, since all
      of objtool's annotations are discarded during linkage and never turn into
      actual instructions, yet GCC bogusly considers most functions carrying
      objtool annotations as 'too large'.
      
      The workaround is to set an assembly macro and call it from the inline
      assembly block. As a result GCC considers the inline assembly block as
      a single instruction. (Which it isn't, but that's the best we can get.)
      
      This increases the kernel size slightly:
      
            text     data     bss      dec     hex filename
        18140829 10224724 2957312 31322865 1ddf2f1 ./vmlinux before
        18140970 10225412 2957312 31323694 1ddf62e ./vmlinux after (+829)
      
      The number of static text symbols (i.e. non-inlined functions) is reduced:
      
        Before:  40321
        After:   40302 (-19)
      
      [ mingo: Rewrote the changelog. ]
      Tested-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Nadav Amit <namit@vmware.com>
      Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Christopher Li <sparse@chrisli.org>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-sparse@vger.kernel.org
      Link: http://lkml.kernel.org/r/20181003213100.189959-4-namit@vmware.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • kbuild/Makefile: Prepare for using macros in inline assembly code to work around asm() related GCC inlining bugs · 77b0bf55
      Nadav Amit authored
      
      Using macros in inline assembly allows us to work around bugs
      in GCC's inlining decisions.
      
      Compile macros.S and use it to assemble all C files.
      Currently only x86 will use it.
      
      Background:
      
      The inlining pass of GCC doesn't include an assembler, so it's not aware
      of basic properties of the generated code, such as its size in bytes,
      or that there are such things as discontinuous blocks of code and data
      due to the newfangled linker feature called 'sections' ...
      
      Instead GCC uses a lazy and fragile heuristic: it does a linear count of
      certain syntactic and whitespace elements in inlined assembly block source
      code, such as a count of new-lines and semicolons (!), as a poor substitute
      for "code size and complexity".
      
      Unsurprisingly this heuristic falls over and breaks its neck with certain
      common types of kernel code that use inline assembly, such as the frequent
      practice of putting useful information into alternative sections.
      
      As a result of this fresh, 20+ year old GCC bug, GCC's inlining decisions
      are effectively disabled for inlined functions that make use of such asm()
      blocks, because GCC thinks those sections of code are "large" - when in
      reality they often result in just a very low number of machine
      instructions.
      
      This absolute lack of inlining prowess when GCC comes across such asm()
      blocks both increases generated kernel code size and causes performance
      overhead, which is particularly noticeable on paravirt kernels, which make
      frequent use of these inlining facilities in an attempt to stay out of the
      way when running on bare-metal hardware.
      
      Instead of fixing the compiler we use a workaround: we set an assembly macro
      and call it from the inlined assembly block. As a result GCC considers the
      inline assembly block as a single instruction. (Which it often isn't but I digress.)
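
      As a purely illustrative sketch of the pattern (the macro, section and
      function names below are made up for illustration, not taken from the
      patch):

        /* Assembly side, e.g. in arch/x86/kernel/macros.S (illustrative):
         *
         *      .macro ANNOTATE_EXAMPLE
         *      .pushsection .discard.example, "a"
         *      .long 0
         *      .popsection
         *      .endm
         */

        /* C side: the asm() body shrinks to a single macro invocation, so
         * GCC's line-counting heuristic now "sees" one instruction. */
        static inline void annotate_example(void)
        {
                asm volatile("ANNOTATE_EXAMPLE");
        }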
      
      This uglifies and bloats the source code - for example just the refcount
      related changes have this impact:
      
       Makefile                 |    9 +++++++--
       arch/x86/Makefile        |    7 +++++++
       arch/x86/kernel/macros.S |    7 +++++++
       scripts/Kbuild.include   |    4 +++-
       scripts/mod/Makefile     |    2 ++
       5 files changed, 26 insertions(+), 3 deletions(-)
      
      Yay readability and maintainability, it's not like assembly code is hard to read
      and maintain ...
      
      We also hope that GCC will eventually get fixed, but we are not holding
      our breath for that. Yet we are optimistic, it might still happen, any decade now.
      
      [ mingo: Wrote new changelog describing the background. ]
      Tested-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Nadav Amit <namit@vmware.com>
      Acked-by: Masahiro Yamada <yamada.masahiro@socionext.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Michal Marek <michal.lkml@markovi.net>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kbuild@vger.kernel.org
      Link: http://lkml.kernel.org/r/20181003213100.189959-3-namit@vmware.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • kbuild/arch/xtensa: Define LINKER_SCRIPT for the linker script · 35e76b99
      Nadav Amit authored
      Define LINKER_SCRIPT when building the linker script, as is done on
      other architectures. This is required because upcoming Makefile changes
      would otherwise break things.
      Signed-off-by: Nadav Amit <namit@vmware.com>
      Acked-by: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Michal Marek <michal.lkml@markovi.net>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-xtensa@linux-xtensa.org
      Link: http://lkml.kernel.org/r/20181003213100.189959-2-namit@vmware.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • c0554d2d
      Ingo Molnar authored
  9. 03 Oct, 2018 3 commits
  10. 02 Oct, 2018 11 commits
    • Merge tag 'fbdev-v4.19-rc7' of https://github.com/bzolnier/linux · 1d2ba7fe
      Greg Kroah-Hartman authored
      Bartlomiej writes:
        "fbdev fixes for v4.19-rc7:
      
         - fix OMAPFB_MEMORY_READ ioctl to not leak kernel memory in omapfb driver
           (Tomi Valkeinen)
      
         - add missing prepare/unprepare clock operations in pxa168fb driver
           (Lubomir Rintel)
      
         - add nobgrt option in efifb driver to disable ACPI BGRT logo restore
           (Hans de Goede)
      
         - fix spelling mistake in fall-through annotation in stifb driver
           (Gustavo A. R. Silva)
      
         - fix URL for uvesafb repository in the documentation (Adam Jackson)"
      
      * tag 'fbdev-v4.19-rc7' of https://github.com/bzolnier/linux:
        video/fbdev/stifb: Fix spelling mistake in fall-through annotation
        uvesafb: Fix URLs in the documentation
        efifb: BGRT: Add nobgrt option
        fbdev/omapfb: fix omapfb_memory_read infoleak
        pxa168fb: prepare the clock
    • Merge tag 'mmc-v4.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc · 5e0b19ac
      Greg Kroah-Hartman authored
      Ulf writes:
        "MMC core:
          - Fixup conversion of debounce time to/from ms/us
      
         MMC host:
          - sdhi: Fixup whitelisting for Gen3 types"
      
      * tag 'mmc-v4.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
        mmc: slot-gpio: Fix debounce time to use miliseconds again
        mmc: core: Fix debounce time to use microseconds
        mmc: sdhi: sys_dmac: check for all Gen3 types when whitelisting
    • Documentation/lockstat: Fix trivial typo · bccb484b
      Andrew Murray authored
      Fix an incorrect line number in the example output.
      Signed-off-by: Andrew Murray <andrew.murray@arm.com>
      Cc: Jiri Kosina <trivial@kernel.org>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: linux-doc@vger.kernel.org
      Link: http://lkml.kernel.org/r/1538391663-54524-1-git-send-email-andrew.murray@arm.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • locking/memory-barriers: Replace smp_cond_acquire() with smp_cond_load_acquire() · 2f359c7e
      Andrea Parri authored
      Amend the changes in commit:
      
        1f03e8d2 ("locking/barriers: Replace smp_cond_acquire() with smp_cond_load_acquire()")
      
      ... by updating the documentation accordingly.
      
      Also remove some obsolete information related to the implementation.
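
      For reference, a hedged usage sketch of the renamed primitive (the
      wait_for_flag() wrapper is made up for illustration):

        static inline unsigned int wait_for_flag(unsigned int *flag)
        {
                /* Spin until *flag becomes nonzero; the final load that
                 * satisfies the condition carries acquire semantics. */
                return smp_cond_load_acquire(flag, VAL != 0);
        }
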
      Signed-off-by: Andrea Parri <andrea.parri@amarulasolutions.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Acked-by: Alan Stern <stern@rowland.harvard.edu>
      Cc: Akira Yokosawa <akiyks@gmail.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Boqun Feng <boqun.feng@gmail.com>
      Cc: Daniel Lustig <dlustig@nvidia.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Jade Alglave <j.alglave@ucl.ac.uk>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luc Maranget <luc.maranget@inria.fr>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: linux-arch@vger.kernel.org
      Cc: parri.andrea@gmail.com
      Link: http://lkml.kernel.org/r/20180926182920.27644-5-paulmck@linux.ibm.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • tools/memory-model: Add more LKMM limitations · d8fa25c4
      Paul E. McKenney authored
      This commit adds more detail about compiler optimizations and
      not-yet-modeled Linux-kernel APIs.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Andrea Parri <andrea.parri@amarulasolutions.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: akiyks@gmail.com
      Cc: boqun.feng@gmail.com
      Cc: dhowells@redhat.com
      Cc: j.alglave@ucl.ac.uk
      Cc: linux-arch@vger.kernel.org
      Cc: luc.maranget@inria.fr
      Cc: npiggin@gmail.com
      Cc: parri.andrea@gmail.com
      Cc: stern@rowland.harvard.edu
      Cc: will.deacon@arm.com
      Link: http://lkml.kernel.org/r/20180926182920.27644-4-paulmck@linux.ibm.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • tools/memory-model: Fix a README typo · 3d2046a6
      SeongJae Park authored
      This commit fixes a duplicate-"the" typo in README.
      Signed-off-by: SeongJae Park <sj38.park@gmail.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Acked-by: Alan Stern <stern@rowland.harvard.edu>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: akiyks@gmail.com
      Cc: boqun.feng@gmail.com
      Cc: dhowells@redhat.com
      Cc: j.alglave@ucl.ac.uk
      Cc: linux-arch@vger.kernel.org
      Cc: luc.maranget@inria.fr
      Cc: npiggin@gmail.com
      Cc: parri.andrea@gmail.com
      Cc: will.deacon@arm.com
      Link: http://lkml.kernel.org/r/20180926182920.27644-3-paulmck@linux.ibm.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • tools/memory-model: Add extra ordering for locks and remove it for ordinary release/acquire · 6e89e831
      Alan Stern authored
      More than one kernel developer has expressed the opinion that the LKMM
      should enforce ordering of writes by locking.  In other words, given
      the following code:
      
      	WRITE_ONCE(x, 1);
      	spin_unlock(&s);
      	spin_lock(&s);
      	WRITE_ONCE(y, 1);
      
      the stores to x and y should be propagated in order to all other CPUs,
      even though those other CPUs might not access the lock s.  In terms of
      the memory model, this means expanding the cumul-fence relation.
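
      As a hedged illustration (not part of the patch): with such ordering, an
      observer CPU that never touches the lock s cannot see the second store
      without also seeing the first:

      	/* CPU 0 */                     /* CPU 1 */
      	WRITE_ONCE(x, 1);
      	spin_unlock(&s);                r1 = READ_ONCE(y);
      	spin_lock(&s);                  smp_rmb();
      	WRITE_ONCE(y, 1);               r2 = READ_ONCE(x);

      	/* Forbidden outcome: r1 == 1 && r2 == 0. */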
      
      Locks should also provide read-read (and read-write) ordering in a
      similar way.  Given:
      
      	READ_ONCE(x);
      	spin_unlock(&s);
      	spin_lock(&s);
      	READ_ONCE(y);		// or WRITE_ONCE(y, 1);
      
      the load of x should be executed before the load of (or store to) y.
      The LKMM already provides this ordering, but it provides it even in
      the case where the two accesses are separated by a release/acquire
      pair of fences rather than unlock/lock.  This would prevent
      architectures from using weakly ordered implementations of release and
      acquire, which seems like an unnecessary restriction.  The patch
      therefore removes the ordering requirement from the LKMM for that
      case.
      
      There are several arguments both for and against this change.  Let us
      refer to these enhanced ordering properties by saying that the LKMM
      would require locks to be RCtso (a bit of a misnomer, but analogous to
      RCpc and RCsc) and it would require ordinary acquire/release only to
      be RCpc.  (Note: In the following, the phrase "all supported
      architectures" is meant not to include RISC-V.  Although RISC-V is
      indeed supported by the kernel, the implementation is still somewhat
      in a state of flux and therefore statements about it would be
      premature.)
      
      Pros:
      
      	The kernel already provides RCtso ordering for locks on all
      	supported architectures, even though this is not stated
      	explicitly anywhere.  Therefore the LKMM should formalize it.
      
      	In theory, guaranteeing RCtso ordering would reduce the need
      	for additional barrier-like constructs meant to increase the
      	ordering strength of locks.
      
      	Will Deacon and Peter Zijlstra are strongly in favor of
      	formalizing the RCtso requirement.  Linus Torvalds and Will
      	would like to go even further, requiring locks to have RCsc
      	behavior (ordering preceding writes against later reads), but
      	they recognize that this would incur a noticeable performance
      	degradation on the POWER architecture.  Linus also points out
      	that people have made the mistake, in the past, of assuming
      	that locking has stronger ordering properties than is
      	currently guaranteed, and this change would reduce the
      	likelihood of such mistakes.
      
      	Not requiring ordinary acquire/release to be any stronger than
      	RCpc may prove advantageous for future architectures, allowing
      	them to implement smp_load_acquire() and smp_store_release()
      	with more efficient machine instructions than would be
      	possible if the operations had to be RCtso.  Will and Linus
      	approve this rationale, hypothetical though it is at the
      	moment (it may end up affecting the RISC-V implementation).
      	The same argument may or may not apply to RMW-acquire/release;
      	see also the second Con entry below.
      
      	Linus feels that locks should be easy for people to use
      	without worrying about memory consistency issues, since they
      	are so pervasive in the kernel, whereas acquire/release is
      	much more of an "experts only" tool.  Requiring locks to be
      	RCtso is a step in this direction.
      
      Cons:
      
      	Andrea Parri and Luc Maranget think that locks should have the
      	same ordering properties as ordinary acquire/release (indeed,
      	Luc points out that the names "acquire" and "release" derive
      	from the usage of locks).  Andrea points out that having
      	different ordering properties for different forms of acquires
      	and releases is not only unnecessary, it would also be
      	confusing and unmaintainable.
      
      	Locks are constructed from lower-level primitives, typically
      	RMW-acquire (for locking) and ordinary release (for unlock).
      	It is illogical to require stronger ordering properties from
      	the high-level operations than from the low-level operations
      	they comprise.  Thus, this change would make
      
      		while (cmpxchg_acquire(&s, 0, 1) != 0)
      			cpu_relax();
      
      	an incorrect implementation of spin_lock(&s) as far as the
      	LKMM is concerned.  In theory this weakness can be ameliorated
      	by changing the LKMM even further, requiring
      	RMW-acquire/release also to be RCtso (which it already is on
      	all supported architectures).
      
      	As far as I know, nobody has singled out any examples of code
      	in the kernel that actually relies on locks being RCtso.
      	(People mumble about RCU and the scheduler, but nobody has
      	pointed to any actual code.  If there are any real cases,
      	their number is likely quite small.)  If RCtso ordering is not
      	needed, why require it?
      
      	A handful of locking constructs (qspinlocks, qrwlocks, and
      	mcs_spinlocks) are built on top of smp_cond_load_acquire()
      	instead of an RMW-acquire instruction.  It currently provides
      	only the ordinary acquire semantics, not the stronger ordering
      	this patch would require of locks.  In theory this could be
      	ameliorated by requiring smp_cond_load_acquire() in
      	combination with ordinary release also to be RCtso (which is
      	currently true on all supported architectures).
      
      	On future weakly ordered architectures, people may be able to
      	implement locks in a non-RCtso fashion with significant
      	performance improvement.  Meeting the RCtso requirement would
      	necessarily add run-time overhead.
      
      Overall, the technical aspects of these arguments seem relatively
      minor, and it appears mostly to boil down to a matter of opinion.
      Since the opinions of senior kernel maintainers such as Linus,
      Peter, and Will carry more weight than those of Luc and Andrea, this
      patch changes the model in accordance with the maintainers' wishes.
      Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Will Deacon <will.deacon@arm.com>
      Reviewed-by: Andrea Parri <andrea.parri@amarulasolutions.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: akiyks@gmail.com
      Cc: boqun.feng@gmail.com
      Cc: dhowells@redhat.com
      Cc: j.alglave@ucl.ac.uk
      Cc: linux-arch@vger.kernel.org
      Cc: luc.maranget@inria.fr
      Cc: npiggin@gmail.com
      Cc: parri.andrea@gmail.com
      Link: http://lkml.kernel.org/r/20180926182920.27644-2-paulmck@linux.ibm.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • tools/memory-model: Add litmus-test naming scheme · c4f790f2
      Paul E. McKenney authored
      This commit documents the scheme used to generate the names for the
      litmus tests.
      
      [ paulmck: Apply feedback from Andrea Parri and Will Deacon. ]
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: akiyks@gmail.com
      Cc: boqun.feng@gmail.com
      Cc: dhowells@redhat.com
      Cc: j.alglave@ucl.ac.uk
      Cc: linux-arch@vger.kernel.org
      Cc: luc.maranget@inria.fr
      Cc: npiggin@gmail.com
      Cc: parri.andrea@gmail.com
      Cc: stern@rowland.harvard.edu
      Link: http://lkml.kernel.org/r/20180926182920.27644-1-paulmck@linux.ibm.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • locking/spinlocks: Remove an instruction from spin and write locks · 27df8968
      Matthew Wilcox authored
      Both spin locks and write locks currently do:
      
       f0 0f b1 17             lock cmpxchg %edx,(%rdi)
       85 c0                   test   %eax,%eax
       75 05                   jne    [slowpath]
      
      This 'test' insn is superfluous; the cmpxchg insn sets the Z flag
      appropriately.  Peter pointed out that using atomic_try_cmpxchg_acquire()
      will let the compiler know this is true.  Comparing before/after
      disassemblies shows that the only effect is to remove this insn.
      
      Take this opportunity to make the spin & write lock code resemble each
      other more closely and have similar likely() hints.
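
      A hedged sketch of the resulting spin-lock fastpath (close to, but not
      guaranteed to match, the patched queued_spin_lock()):

        static __always_inline void queued_spin_lock(struct qspinlock *lock)
        {
                u32 val = 0;

                /* try_cmpxchg's boolean result lets the compiler branch on the
                 * flags already set by cmpxchg, dropping the extra 'test'. */
                if (likely(atomic_try_cmpxchg_acquire(&lock->val, &val, _Q_LOCKED_VAL)))
                        return;

                queued_spin_lock_slowpath(lock, val);
        }
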
      Suggested-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Matthew Wilcox <willy@infradead.org>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Waiman Long <longman@redhat.com>
      Link: http://lkml.kernel.org/r/20180820162639.GC25153@bombadil.infradead.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • jump_label: Fix NULL dereference bug in __jump_label_mod_update() · 77ac1c02
      Ard Biesheuvel authored
      Commit 19483677 ("jump_label: Annotate entries that operate on
      __init code earlier") refactored the code that manages runtime
      patching of jump labels in modules that are tied to static keys
      defined in other modules or in the core kernel.
      
      In the latter case, we may iterate over the static_key_mod linked
      list until we hit the entry for the core kernel, whose 'mod' field
      will be NULL, and attempt to dereference it to get at its 'state'
      member.
      
      So let's add a non-NULL check: this forces the 'init' argument of
      __jump_label_update() to false for static keys that are defined in
      the core kernel, which is appropriate given that __init annotated
      jump_label entries in the core kernel should no longer be active
      at this point (i.e., when loading modules).
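
      A hedged sketch of the guard inside __jump_label_mod_update() ('stop'
      and 'entries' come from the surrounding loop; names are illustrative):

        struct module *m = mod->mod;

        /* m == NULL means the key lives in the core kernel, whose __init
         * jump entries are no longer active, so 'init' must stay false. */
        __jump_label_update(key, mod->entries, stop,
                            m && m->state == MODULE_STATE_COMING);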
      
      Fixes: 19483677 ("jump_label: Annotate entries that operate on ...")
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Cc: Jessica Yu <jeyu@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/20181001081324.11553-1-ard.biesheuvel@linaro.org
    • s390/vmlinux.lds: Move JUMP_TABLE_DATA into output section · 57d15877
      Ard Biesheuvel authored
      Commit e872267b ("jump_table: move entries into ro_after_init
      region") moved the __jump_table input section into the __ro_after_init
      output section, but inadvertently put the macro in the wrong place in
      the s390 linker script. Let's fix that.
      
      Fixes: e872267b ("jump_table: move entries into ro_after_init region")
      Reported-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Tested-by: Guenter Roeck <linux@roeck-us.net>
      Cc: linux-s390@vger.kernel.org
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Jessica Yu <jeyu@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/20180930164950.3841-1-ard.biesheuvel@linaro.org