1. 08 Jun, 2017 37 commits
    • srcu: Make exp_holdoff module parameter be static · b5815e6c
      Paul E. McKenney authored
      Because exp_holdoff is not used outside of srcutree.c, it can be static.
      This commit therefore makes this change.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Update rcu_bootup_announce_oddness() · 17c7798b
      Paul E. McKenney authored
      This commit updates rcu_bootup_announce_oddness() to check additional
      Kconfig options and module/boot parameters.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Print out rcupdate.c non-default boot-time settings · 59d80fd8
      Paul E. McKenney authored
      This commit adds a rcupdate_announce_bootup_oddness() function to
      print out non-default values of significant kernel boot parameter
      settings to aid in debugging.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Add preemptibility checks in rcu_sched_qs() and rcu_bh_qs() · f4687d26
      Paul E. McKenney authored
      This commit adds WARN_ON_ONCE() calls that trigger if either
      rcu_sched_qs() or rcu_bh_qs() are invoked with preemption enabled.
      In the immortal words of Peter Zijlstra: "these are much harder to ignore
      than comments".
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • doc: Take tail recursion into account in RCU requirements · c75e9caa
      Paul E. McKenney authored
      This commit classifies tail recursion as an alternative way to write
      a loop, with similar limitations.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • srcu: Document auto-expediting requirement · 09f501a0
      Paul E. McKenney authored
      This commit documents the auto-expediting requirement satisfied by
      commits 2da4b2a7 ("srcu: Expedite first synchronize_srcu() when idle")
      and 22607d66 ("srcu: Specify auto-expedite holdoff time").
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcutorture: Add "git diff" output to testid.txt file · 5d9853f3
      Paul E. McKenney authored
      Currently, when running from a git archive, the testid.txt file contains
      only the branch name, the output of "git status", and the SHA-1 of
      the current HEAD.  This is useful, but does not uniquely identify the
      source code that was built.  This commit therefore adds the output of
      "git diff HEAD", which means that if two testid.txt files compare equal,
      they correspond to exactly the same source code.  Give or take the
      possibility of SHA-1 collisions, that is.  ;-)
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcuperf: Add writer_holdoff boot parameter · 820687a7
      Paul E. McKenney authored
      This commit adds a writer_holdoff boot parameter to rcuperf, which is
      intended to be used to test Tree SRCU's auto-expediting.  This
      boot parameter is in microseconds, and defaults to zero (that is,
      disabled).  Set it to a value a bit larger than srcutree.exp_holdoff,
      keeping the nanosecond/microsecond conversion in mind, to force
      Tree SRCU to auto-expedite more aggressively.
      
      This commit also adds documentation for this parameter, and fixes some
      alphabetization while in the neighborhood.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • srcu-cbmc: Use /usr/bin/awk instead of /bin/awk · b562b85c
      Priyalee Kushwaha authored
      Most OS distributions have awk in /usr/bin, not in /bin.
      Without this patch, kernel-devsrc fails to build because the
      srcu-cbmc script's runtime dependency, /bin/awk, is not found.
      Signed-off-by: Kushwaha, Priyalee <priyalee.kushwaha@intel.com>
      Acked-by: Lance Roy <ldr709@gmail.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcuperf: Set more user-friendly defaults · 492b95e5
      Paul E. McKenney authored
      Common-case use of rcuperf must set rcuperf.nreaders=0 and, if rcuperf
      is not built as a module, rcuperf.shutdown.  This commit therefore sets
      the default for rcuperf.nreaders to zero and sets the default for
      rcuperf.shutdown to zero if rcuperf is built as a module and to one
      otherwise.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • srcu: Shrink Tiny SRCU a bit more · 3ddf20c9
      Paul E. McKenney authored
      This commit rearranges Tiny SRCU's srcu_struct structure, substitutes
      u8 for bool, and shrinks counters down to short.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcutorture: Reduce CPUs dedicated to testing Classic SRCU · 41f36481
      Paul E. McKenney authored
      Given that the plan is to retire Classic SRCU in the near future, this
      commit reduces the number of CPUs dedicated to testing Classic SRCU.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • srcu: Make Classic and Tree SRCU announce themselves at bootup · 1f4f6da1
      Paul E. McKenney authored
      Currently, the only way to tell whether a given kernel is running
      Classic, Tiny, or Tree SRCU is to look at the .config file, which
      can easily be lost or associated with the wrong kernel.  This commit
      therefore has Classic and Tree SRCU identify themselves at boot time.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcuperf: Add the ability to test tiny RCU flavors · 59ca3f9f
      Paul E. McKenney authored
      This commit adds a TINY rcuperf test scenario, which allows performance
      testing of Tiny RCU and Tiny SRCU.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • docs: Fix typo in Documentation/memory-barriers.txt · 35bdc72a
      Stan Drozd authored
      This commit changes "architecure" to the correct spelling,
      "architecture".
      Signed-off-by: Stan Drozd <drozdziak1@gmail.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • atomics: Add header comment so spin_unlock_wait() · 6016ffc3
      Paul E. McKenney authored
      There is material describing the ordering guarantees provided by
      spin_unlock_wait(), but it is not necessarily easy to find.  This commit
      therefore adds a docbook header comment to this function informally
      describing its semantics.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
    • doc/atomic_ops: Clarify smp_mb__{before,after}_atomic() · 79269ee3
      Paul E. McKenney authored
      This commit explicitly states that surrounding a non-value-returning
      atomic read-modify-write operation with smp_mb__before_atomic() and
      smp_mb__after_atomic() provides full ordering, just as is provided by
      value-returning atomic read-modify-write operations.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcuperf: Add test for dynamically initialized srcu_struct · f60cb4d4
      Paul E. McKenney authored
      This commit adds a perf_type of "srcud", which specifies that rcuperf
      should test SRCU on a dynamically initialized srcu_struct.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • checkpatch: Remove checks for expedited grace periods · 98953135
      Paul E. McKenney authored
      There was a time when the expedited grace-period primitives
      (synchronize_rcu_expedited(), synchronize_rcu_bh_expedited(), and
      synchronize_sched_expedited()) used rather antisocial kernel
      facilities like try_stop_cpus().  However, they have since been
      housebroken to use only single-CPU IPIs, and typically cause less
      disturbance than a scheduling-clock interrupt.  Furthermore, this
      disturbance can be eliminated entirely using NO_HZ_FULL on the
      one hand or the rcupdate.rcu_normal boot parameter on the other.
      
      This commit therefore removes checkpatch's complaints about use
      of the expedited RCU primitives.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Make sync_rcu_preempt_exp_done() return bool · dcfc315b
      Paul E. McKenney authored
      The sync_rcu_preempt_exp_done() function returns a logical expression,
      but its return type is nevertheless int.  This commit therefore changes
      the return type to bool.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcuperf: Add a Kconfig-fragment file for Classic SRCU · ced8d6fd
      Paul E. McKenney authored
      This commit adds a Kconfig-fragment file for Classic SRCU to ease
      performance comparisons with Tree SRCU.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcuperf: Add ability to performance-test call_rcu() and friends · 881ed593
      Paul E. McKenney authored
      This commit upgrades rcuperf so that it can do performance testing on
      asynchronous grace-period primitives such as call_srcu().  There is
      a new rcuperf.gp_async module parameter that specifies this new behavior,
      with the pre-existing rcuperf.gp_exp testing expedited grace periods such as
      synchronize_rcu_expedited(), and with the default being to test synchronous
      non-expedited grace periods such as synchronize_rcu().
      
      There is also a new rcuperf.gp_async_max module parameter that specifies
      the maximum number of outstanding callbacks per writer kthread, defaulting
      to 1,000.  When this limit is exceeded, the writer thread invokes the
      appropriate flavor of rcu_barrier() to wait for callbacks to drain.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      [ paulmck: Removed the redundant initialization noted by Arnd Bergmann. ]
    • rcu: Remove obsolete reference to synchronize_kernel() · e28371c8
      Paul E. McKenney authored
      The synchronize_kernel() primitive was removed in favor of
      synchronize_sched() more than a decade ago, and it seems likely that
      rather few kernel hackers are familiar with it.  Its continued presence
      is therefore providing more confusion than enlightenment.  This commit
      therefore removes the reference from the synchronize_sched() header
      comment, and adds the corresponding information to the synchronize_rcu()
      header comment.
      Reported-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcuperf: Remove conflicting Kconfig options · 1dcf2806
      Paul E. McKenney authored
      The TREE and TREE54 rcuperf scenarios' Kconfig fragment files specified
      conflicting values for CONFIG_RCU_TRACE.  This commit therefore removes
      the =n line in favor of the =y line.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcuperf: Defer expedited/normal check to end of test · 9683937d
      Paul E. McKenney authored
      Currently, rcuperf startup checks to see if the user asked to measure
      only expedited grace periods, yet constrained all grace periods to be
      normal, or if the user asked to measure only normal grace periods, yet
      constrained all grace periods to be expedited.  Useless tests of this
      sort are aborted.
      
      Unfortunately, making RCU work through the mid-boot dead zone [1] puts
      RCU into expedited-only mode during that zone.  Which happens to also
      be the exact time that rcuperf carries out the aforementioned check.
      So if the user asks rcuperf to measure only normal grace periods (the
      default), rcuperf will now always complain and terminate the test.
      
      This commit therefore moves the checks to rcu_perf_cleanup().  This has
      the disadvantage of failing to abort useless tests, but avoids the need to
      create yet another kthread and the need to do fiddly checks involving the
      holdoff time.  (Yes, another approach is to do the checks in a late-stage
      init function, but that would require some way to communicate badness
      to rcuperf's kthreads, and seems not worth the bother.)
      
      [1] https://lwn.net/Articles/716148/
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Complain if blocking in preemptible RCU read-side critical section · 5b72f964
      Paul E. McKenney authored
      Although preemptible RCU allows its read-side critical sections to be
      preempted, general blocking is forbidden.  The reason for this is that
      excessive preemption times can be handled by CONFIG_RCU_BOOST=y, but a
      voluntarily blocked task doesn't care how high you boost its priority.
      Because preemptible RCU is a global mechanism, one ill-behaved reader
      hurts everyone.  Hence the prohibition against general blocking in
      RCU-preempt read-side critical sections.  Preemption yes, blocking no.
      
      This commit enforces this prohibition.
      
      There is a special exception for the -rt patchset (which they kindly
      volunteered to implement):  It is OK to block (as opposed to merely being
      preempted) within an RCU-preempt read-side critical section, but only if
      the blocking is subject to priority inheritance.  This exception permits
      CONFIG_RCU_BOOST=y to get -rt RCU readers out of trouble.
      
      Why doesn't this exception also apply to mainline's rt_mutex?  Because
      of the possibility that someone does general blocking while holding
      an rt_mutex.  Yes, the priority boosting will affect the rt_mutex,
      but it won't help with the task doing general blocking while holding
      that rt_mutex.
      Reported-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • srcu: Eliminate possibility of destructive counter overflow · 881ec9d2
      Paul E. McKenney authored
      Earlier versions of Tree SRCU were subject to a counter overflow bug that
      could theoretically result in too-short grace periods.  This commit
      eliminates this problem by adding an update-side memory barrier.
      The short explanation is that if the updater sums the unlock counts
      too late to see a given __srcu_read_unlock() increment, that CPU's
      next __srcu_read_lock() must see the new value of ->srcu_idx, thus
      incrementing the other bank of counters.  This eliminates the possibility
      of destructive counter overflow as long as the srcu_read_lock() nesting
      level does not exceed floor(ULONG_MAX/NR_CPUS/2), which should be an
      eminently reasonable nesting limit, especially on 64-bit systems.
      Reported-by: Lance Roy <ldr709@gmail.com>
      Suggested-by: Lance Roy <ldr709@gmail.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
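To put the nesting limit quoted above in perspective, the following user-space sketch evaluates floor(ULONG_MAX/NR_CPUS/2). The helper name srcu_nesting_limit() and the sample CPU counts are illustrative assumptions, not kernel code.

```c
#include <limits.h>
#include <stdio.h>

/*
 * Hypothetical helper (not a kernel function): the srcu_read_lock()
 * nesting bound from the commit message, floor(ULONG_MAX/NR_CPUS/2).
 * Unsigned integer division already rounds down, so no explicit
 * floor() is needed.
 */
static unsigned long srcu_nesting_limit(unsigned long nr_cpus)
{
	return ULONG_MAX / nr_cpus / 2;
}

/* Print the bound for a few arbitrarily chosen CPU counts. */
static void print_nesting_limits(void)
{
	static const unsigned long samples[] = { 1, 64, 4096 };

	for (size_t i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
		printf("NR_CPUS=%lu -> limit %lu\n", samples[i],
		       srcu_nesting_limit(samples[i]));
}
```

With NR_CPUS=4096, the bound works out to 2^51 - 1 when unsigned long is 64 bits wide and to 2^19 - 1 (524,287) when it is 32 bits wide, which is why the message singles out 64-bit systems as especially comfortable.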
    • rcutorture: Update test scenarios based on new Kconfig dependencies · 17ed2b6c
      Paul E. McKenney authored
      A number of the rcutorture test scenarios were not using the desired
      Kconfig options because dependencies were preventing the selections in the
      Kconfig-fragment files from being honored.  This commit therefore updates
      the Kconfig-fragment files to account for these changes in dependencies.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcutorture: Correctly handle CONFIG_RCU_TORTURE_TEST_* options · 39687d6c
      Paul E. McKenney authored
      The rcutorture scripting handles the CONFIG_*_TORTURE_TEST Kconfig
      options specially, and therefore greps them out of the Kconfig-fragment
      files.  Unfortunately, a poor choice of grep pattern means that the
      CONFIG_RCU_TORTURE_TEST_SLOW_CLEANUP, CONFIG_RCU_TORTURE_TEST_SLOW_INIT,
      and CONFIG_RCU_TORTURE_TEST_SLOW_PREINIT Kconfig options are also grepped
      out, preventing rcutorture from using them.  This commit therefore fixes
      the offending grep pattern to focus only on the CONFIG_*_TORTURE_TEST
      Kconfig options.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
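The over-eager match described above can be reproduced with POSIX regular expressions. This is a sketch only: both patterns below are assumptions chosen to illustrate the failure mode, and the patterns actually used by the rcutorture scripts may differ.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if the POSIX basic regex 'pattern' matches anywhere in 'line'. */
static bool line_matches(const char *pattern, const char *line)
{
	regex_t re;
	bool hit;

	if (regcomp(&re, pattern, REG_NOSUB) != 0)
		return false;	/* Treat an invalid pattern as "no match". */
	hit = regexec(&re, line, 0, NULL, 0) == 0;
	regfree(&re);
	return hit;
}

/* Illustrative patterns only (assumed, not copied from the scripts). */
static const char loose_pattern[] = "CONFIG_[A-Z]*_TORTURE_TEST";
static const char tight_pattern[] = "CONFIG_[A-Z]*_TORTURE_TEST=";
```

The loose pattern also matches CONFIG_RCU_TORTURE_TEST_SLOW_INIT=y because nothing pins down where the option name ends. Requiring the "=" that follows the option name restricts the match to the CONFIG_*_TORTURE_TEST options themselves.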
    • rcu: Prevent rcu_barrier() from starting needless grace periods · f92c734f
      Paul E. McKenney authored
      Currently rcu_barrier() uses call_rcu() to enqueue new callbacks
      on each CPU with a non-empty callback list.  This works, but means
      that rcu_barrier() forces grace periods that are not otherwise needed.
      The key point is that rcu_barrier() never needs to wait for a grace
      period, but instead only for all pre-existing callbacks to be invoked.
      This means that rcu_barrier()'s new callbacks should be placed in
      the callback-list segment containing the last pre-existing callback.
      
      This commit makes this change using the new rcu_segcblist_entrain()
      function.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
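The idea of placing a callback in the segment that holds the last pre-existing callback can be sketched in user space. The model below is a simplified assumption and not the kernel's rcu_segcblist code: the type and function names (struct seglist, seglist_entrain(), and so on) are hypothetical, and locking, memory ordering, and grace-period numbers are all omitted.

```c
#include <stdbool.h>
#include <stddef.h>

/* Segments in list order: callbacks ready to invoke come first. */
enum { SEG_DONE, SEG_WAIT, SEG_NEXT_READY, SEG_NEXT, NSEGS };

struct cb {
	struct cb *next;
};

struct seglist {
	struct cb *head;
	struct cb **tails[NSEGS];	/* tails[i]: ->next slot ending segment i */
	long len;
};

static void seglist_init(struct seglist *sl)
{
	sl->head = NULL;
	for (int i = 0; i < NSEGS; i++)
		sl->tails[i] = &sl->head;
	sl->len = 0;
}

/* Ordinary enqueue: new callbacks always land in the last segment. */
static void seglist_enqueue(struct seglist *sl, struct cb *cb)
{
	cb->next = NULL;
	*sl->tails[SEG_NEXT] = cb;
	sl->tails[SEG_NEXT] = &cb->next;
	sl->len++;
}

/*
 * Entrain: append cb to the last non-empty segment so that it is invoked
 * right after the last pre-existing callback, without needing a new
 * grace period.  Returns false if there is nothing to wait for.
 */
static bool seglist_entrain(struct seglist *sl, struct cb *cb)
{
	int i;

	if (sl->len == 0)
		return false;
	cb->next = NULL;
	for (i = SEG_NEXT; i > SEG_DONE; i--)
		if (sl->tails[i] != sl->tails[i - 1])
			break;		/* Segment i is the last non-empty one. */
	*sl->tails[i] = cb;
	for (; i < NSEGS; i++)
		sl->tails[i] = &cb->next;
	sl->len++;
	return true;
}

/* Demo helper: pretend every queued callback has advanced to segment seg. */
static void seglist_move_all(struct seglist *sl, int seg)
{
	struct cb **end = sl->tails[SEG_NEXT];

	for (int i = 0; i < NSEGS; i++)
		sl->tails[i] = (i < seg) ? &sl->head : end;
}

/* Return the segment index holding target, or -1 if it is not queued. */
static int seglist_segment_of(struct seglist *sl, struct cb *target)
{
	struct cb **pp = &sl->head;
	int seg = 0;

	for (;;) {
		while (seg < NSEGS && sl->tails[seg] == pp)
			seg++;		/* Skip segments that end here. */
		if (!*pp)
			return -1;
		if (*pp == target)
			return seg;
		pp = &(*pp)->next;
	}
}
```

In this model, a callback added with seglist_enqueue() would have to wait for a later grace period, whereas one added with seglist_entrain() rides along with callbacks that are already waiting.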
    • rcutorture: Add a scenario for Classic SRCU · c0ee4500
      Paul E. McKenney authored
      A robust combination of paranoia and cowardice has resulted in retaining
      Classic SRCU (CONFIG_CLASSIC_SRCU) as a backup for the shiny new Tiny
      and Tree SRCU implementations.  If it is to be a viable backup, it of
      course needs to be tested.  This commit therefore adds an rcutorture
      scenario named SRCU-C for Classic SRCU.  This commit also adds this
      scenario to the set that are run by default.
      
      Once sufficient good experience has accumulated for Tiny and Tree SRCU,
      this test will be removed, along with the Classic SRCU implementation
      itself.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcutorture: Add a scenario for Tiny SRCU · 23ca0967
      Paul E. McKenney authored
      This commit adds an SRCU-t rcutorture scenario for the new Tiny SRCU
      implementation, removing the need to pass the --bootargs parameter to
      kvm.sh to run Tiny SRCU tests.  This commit also adds SRCU-t to the set
      of scenarios that are run by default.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcutorture: Fix bug in reporting Kconfig mis-settings · 3c52f262
      Paul E. McKenney authored
      Kconfig "select" clauses can defeat Kconfig-fragment file attempts to
      clear a given Kconfig variable, and dependencies can defeat attempts to
      set a given Kconfig variable.  Because "select" clauses and dependencies
      can be added at any time, there needs to be a way to verify that the
      Kconfig-fragment file's requests were honored.  And there is, except
      that it is buggy.  This commit therefore provides the needed fix.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcutorture: Add three-level tree test for Tree SRCU · 8d6dd656
      Paul E. McKenney authored
      This commit adds a test for a three-level srcu_node tree for Tree SRCU
      in the existing SRCU-P scenario.  This requires enabling CONFIG_RCU_EXPERT,
      so the CONFIG_RCU_EXPERT=n scenario is now SRCU-N.  The reason for using
      SRCU-P for the tall tree is that preemption raises the possibility of
      locating more bugs than does the non-preemptive SRCU-N.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcutorture: Add lockdep to one of the SRCU scenarios · 27dc0b1b
      Paul E. McKenney authored
      Back when SRCU was simpler, there wasn't much need for lockdep.
      However, with Tree SRCU, it is needed.  This commit therefore adds
      CONFIG_PROVE_LOCKING to the SRCU-P scenario.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • srcu: Allow use of Classic SRCU from both process and interrupt context · 1123a604
      Paolo Bonzini authored
      Linu Cherian reported a WARN in cleanup_srcu_struct() when shutting
      down a guest running iperf on a VFIO assigned device.  This happens
      because irqfd_wakeup() calls srcu_read_lock(&kvm->irq_srcu) in interrupt
      context, while a worker thread does the same inside kvm_set_irq().  If the
      interrupt happens while the worker thread is executing __srcu_read_lock(),
      updates to the Classic SRCU ->lock_count[] field or the Tree SRCU
      ->srcu_lock_count[] field can be lost.
      
      The docs say you are not supposed to call srcu_read_lock() and
      srcu_read_unlock() from irq context, but KVM interrupt injection happens
      from (host) interrupt context and it would be nice if SRCU supported the
      use case.  KVM is using SRCU here not really for the "sleepable" part,
      but rather due to its IPI-free fast detection of grace periods.  It is
      therefore not desirable to switch back to RCU, which would effectively
      revert commit 719d93cd ("kvm/irqchip: Speed up KVM_SET_GSI_ROUTING",
      2014-01-16).
      
      However, the docs are overly conservative.  You can have an SRCU instance
      that only has users in irq context, and you can mix process and irq context
      as long as process context users disable interrupts.  In addition,
      __srcu_read_unlock() actually uses this_cpu_dec() on both Tree SRCU and
      Classic SRCU.  For those two implementations, only srcu_read_lock()
      is unsafe.
      
      When Classic SRCU's __srcu_read_unlock() was changed to use this_cpu_dec(),
      in commit 5a41344a ("srcu: Simplify __srcu_read_unlock() via
      this_cpu_dec()", 2012-11-29), __srcu_read_lock() did two increments.
      Therefore it kept __this_cpu_inc(), with preempt_disable/enable in
      the caller.  Tree SRCU however only does one increment, so on most
      architectures it is more efficient for __srcu_read_lock() to use
      this_cpu_inc(), and any performance differences appear to be down in
      the noise.
      
      Cc: stable@vger.kernel.org
      Fixes: 719d93cd ("kvm/irqchip: Speed up KVM_SET_GSI_ROUTING")
      Reported-by: Linu Cherian <linuc.decode@gmail.com>
      Suggested-by: Linu Cherian <linuc.decode@gmail.com>
      Cc: kvm@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
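The lost update described in this commit message can be illustrated with a deterministic user-space model. This is a sketch under stated assumptions: the "interrupt" is an ordinary function call placed at the worst possible point, and the names lock_count, fake_irq_handler(), split_inc_interrupted(), and atomic_inc_then_irq() are hypothetical. It only mimics the difference between an increment that can be split by an interrupt (in the spirit of __this_cpu_inc()) and one that cannot (in the spirit of this_cpu_inc()).

```c
/* Simulated per-CPU counter standing in for one lock-count slot. */
static long lock_count;

/* Simulated interrupt handler: performs its own complete increment. */
static void fake_irq_handler(void)
{
	lock_count++;
}

/*
 * Models an increment carried out as separate load and store steps,
 * with an interrupt arriving between the two.
 */
static long split_inc_interrupted(void)
{
	long tmp;

	lock_count = 0;
	tmp = lock_count;	/* Load the old value. */
	fake_irq_handler();	/* Interrupt lands here and increments. */
	lock_count = tmp + 1;	/* Store overwrites the handler's update. */
	return lock_count;
}

/*
 * Models an increment that is indivisible with respect to interrupts:
 * the handler can only run before or after it.  Here it runs after.
 */
static long atomic_inc_then_irq(void)
{
	lock_count = 0;
	lock_count++;		/* Cannot be split by the interrupt. */
	fake_irq_handler();
	return lock_count;
}
```

Two increments were requested in both cases, but the split version records only one of them, which corresponds to the kind of counter imbalance that cleanup_srcu_struct() warns about.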
    • srcu: Allow use of Tiny/Tree SRCU from both process and interrupt context · cdf7abc4
      Paolo Bonzini authored
      Linu Cherian reported a WARN in cleanup_srcu_struct() when shutting
      down a guest running iperf on a VFIO assigned device.  This happens
      because irqfd_wakeup() calls srcu_read_lock(&kvm->irq_srcu) in interrupt
      context, while a worker thread does the same inside kvm_set_irq().  If the
      interrupt happens while the worker thread is executing __srcu_read_lock(),
      updates to the Classic SRCU ->lock_count[] field or the Tree SRCU
      ->srcu_lock_count[] field can be lost.
      
      The docs say you are not supposed to call srcu_read_lock() and
      srcu_read_unlock() from irq context, but KVM interrupt injection happens
      from (host) interrupt context and it would be nice if SRCU supported the
      use case.  KVM is using SRCU here not really for the "sleepable" part,
      but rather due to its IPI-free fast detection of grace periods.  It is
      therefore not desirable to switch back to RCU, which would effectively
      revert commit 719d93cd ("kvm/irqchip: Speed up KVM_SET_GSI_ROUTING",
      2014-01-16).
      
      However, the docs are overly conservative.  You can have an SRCU instance
      that only has users in irq context, and you can mix process and irq context
      as long as process context users disable interrupts.  In addition,
      __srcu_read_unlock() actually uses this_cpu_dec() on both Tree SRCU and
      Classic SRCU.  For those two implementations, only srcu_read_lock()
      is unsafe.
      
      When Classic SRCU's __srcu_read_unlock() was changed to use this_cpu_dec(),
      in commit 5a41344a ("srcu: Simplify __srcu_read_unlock() via
      this_cpu_dec()", 2012-11-29), __srcu_read_lock() did two increments.
      Therefore it kept __this_cpu_inc(), with preempt_disable/enable in
      the caller.  Tree SRCU however only does one increment, so on most
      architectures it is more efficient for __srcu_read_lock() to use
      this_cpu_inc(), and any performance differences appear to be down in
      the noise.
      
      Unlike Classic and Tree SRCU, Tiny SRCU does increments and decrements on
      a single variable.  Therefore, as Peter Zijlstra pointed out, Tiny SRCU's
      implementation already supports mixed-context use of srcu_read_lock()
      and srcu_read_unlock(), at least as long as uses of srcu_read_lock()
      and srcu_read_unlock() in each handler are nested and paired properly.
      In other words, it is still illegal to (say) invoke srcu_read_lock()
      in an interrupt handler and to invoke the matching srcu_read_unlock()
      in a softirq handler.  Therefore, the only change required for Tiny SRCU
      is to its comments.
      
      Fixes: 719d93cd ("kvm/irqchip: Speed up KVM_SET_GSI_ROUTING")
      Reported-by: Linu Cherian <linuc.decode@gmail.com>
      Suggested-by: Linu Cherian <linuc.decode@gmail.com>
      Cc: kvm@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Tested-by: Paolo Bonzini <pbonzini@redhat.com>
  2. 04 Jun, 2017 3 commits
    • Linux 4.12-rc4 · 3c2993b8
      Linus Torvalds authored
    • fs/ufs: Set UFS default maximum bytes per file · 239e250e
      Richard Narron authored
      This fixes a problem with reading files larger than 2GB from a UFS-2
      file system:
      
          https://bugzilla.kernel.org/show_bug.cgi?id=195721
      
      The incorrect UFS s_maxbytes limit became a problem as of commit
      c2a9737f ("vfs,mm: fix a dead loop in truncate_inode_pages_range()")
      which started using s_maxbytes to avoid a page index overflow in
      do_generic_file_read().
      
      That caused files to be truncated on UFS-2 file systems because the
      default maximum file size is 2GB (MAX_NON_LFS) and UFS didn't update it.
      
      Here I simply increase the default to a common value used by other file
      systems.
      Signed-off-by: Richard Narron <comet.berkeley@gmail.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Will B <will.brokenbourgh2877@gmail.com>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: <stable@vger.kernel.org> # v4.9 and backports of c2a9737f
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
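The effect of a 2GB cap on reads can be sketched with a small model. This is an assumption-laden illustration and not the kernel's read path: readable_bytes() is a hypothetical helper, and MODEL_MAX_NON_LFS simply mirrors the 2^31 - 1 value of MAX_NON_LFS.

```c
#include <stdint.h>

/* 2 GiB - 1, mirroring the value of the kernel's MAX_NON_LFS. */
#define MODEL_MAX_NON_LFS ((UINT64_C(1) << 31) - 1)

/*
 * Hypothetical model of a read path that honors a per-filesystem size
 * cap: returns how many of 'count' bytes starting at 'pos' may be read.
 */
static uint64_t readable_bytes(uint64_t pos, uint64_t count, uint64_t maxbytes)
{
	if (pos >= maxbytes)
		return 0;		/* Entire request lies beyond the cap. */
	if (count > maxbytes - pos)
		return maxbytes - pos;	/* Request straddles the cap. */
	return count;
}
```

Under this model, a read starting at the 3GB mark of a large file returns nothing while the cap is MODEL_MAX_NON_LFS, which matches the truncation symptom described above. Raising the cap makes the same read succeed.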
    • Merge tag 'nfs-for-4.12-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs · 125f42b0
      Linus Torvalds authored
      Pull NFS client bugfixes from Trond Myklebust:
       "Bugfixes include:
      
         - Fix a typo in commit e0926934 ("NFS append COMMIT after
           synchronous COPY") that breaks copy offload
      
         - Fix the connect error propagation in xs_tcp_setup_socket()
      
         - Fix a lock leak in nfs40_walk_client_list
      
         - Verify that pNFS requests lie within the offset range of the layout
           segment"
      
      * tag 'nfs-for-4.12-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
        nfs: Mark unnecessarily extern functions as static
        SUNRPC: ensure correct error is reported by xs_tcp_setup_socket()
        NFSv4.0: Fix a lock leak in nfs40_walk_client_list
        pnfs: Fix the check for requests in range of layout segment
        xprtrdma: Delete an error message for a failed memory allocation in xprt_rdma_bc_setup()
        pNFS/flexfiles: missing error code in ff_layout_alloc_lseg()
        NFS fix COMMIT after COPY