1. 06 Apr, 2018 5 commits
    • genirq/affinity: Spread irq vectors among present CPUs as far as possible · d3056812
      Ming Lei authored
      Commit 84676c1f ("genirq/affinity: assign vectors to all possible CPUs")
      tried to spread the interrupts across all possible CPUs to make sure that
      in case of physical hotplug (e.g. virtualization) the CPUs which get
      plugged in after the device was initialized are targeted by a hardware
      queue and the corresponding interrupt.
      
      This has a downside in cases where the ACPI tables claim that there are
      more possible CPUs than present CPUs and the number of interrupts to spread
      out is smaller than the number of possible CPUs. These bogus ACPI tables
      are unfortunately not uncommon.
      
      In such a case the vector spreading algorithm assigns interrupts to CPUs
      which can never be utilized and as a consequence these interrupts are
      unused instead of being mapped to present CPUs. As a result the performance
      of the device is suboptimal.
      
      To fix this, spread the interrupt vectors in two stages:
      
       1) Spread as many interrupts as possible among the present CPUs
      
       2) Spread the remaining vectors among non present CPUs
      
      On an 8-core system, where CPUs 0-3 are present and CPUs 4-7 are not present,
      for a device with 4 queues the resulting interrupt affinity is:
      
        1) Before 84676c1f ("genirq/affinity: assign vectors to all possible CPUs")
      	irq 39, cpu list 0
      	irq 40, cpu list 1
      	irq 41, cpu list 2
      	irq 42, cpu list 3
      
        2) With 84676c1f ("genirq/affinity: assign vectors to all possible CPUs")
      	irq 39, cpu list 0-2
      	irq 40, cpu list 3-4,6
      	irq 41, cpu list 5
      	irq 42, cpu list 7
      
        3) With the refined vector spread applied:
      	irq 39, cpu list 0,4
      	irq 40, cpu list 1,6
      	irq 41, cpu list 2,5
      	irq 42, cpu list 3,7
      
      On an 8-core system, where all CPUs are present, the resulting interrupt
      affinity for the 4 queues is:
      
      	irq 39, cpu list 0,1
      	irq 40, cpu list 2,3
      	irq 41, cpu list 4,5
      	irq 42, cpu list 6,7
      
      This is independent of the number of CPUs which are online at the point of
      initialization because in such a system the offline CPUs can be easily
      onlined afterwards, while non-present CPUs need to be plugged physically
      or virtually, which requires external interaction.
      
      The downside of this approach is that in case of physical hotplug the
      interrupt vector spreading might be suboptimal when CPUs 4-7 are physically
      plugged. It is suboptimal from a NUMA point of view, and due to the single
      target nature of interrupt affinities the later plugged CPUs might not be
      targeted by interrupts at all.
      
      However, physical hotplug systems are not the common case, while the broken
      ACPI table disease is widespread. So it's preferred to have as many
      interrupts as possible utilized at the point where the device is
      initialized.
      
      Block multi-queue devices like NVME create a hardware queue per possible
      CPU, so the goal of commit 84676c1f to assign one interrupt vector per
      possible CPU is still achieved even with physical/virtual hotplug.
      
      [ tglx: Changed from online to present CPUs for the first spreading stage,
        	renamed variables for readability's sake, added comments and massaged
        	changelog ]
      Reported-by: Laurence Oberman <loberman@redhat.com>
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: linux-block@vger.kernel.org
      Cc: Christoph Hellwig <hch@infradead.org>
      Link: https://lkml.kernel.org/r/20180308105358.1506-5-ming.lei@redhat.com
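
The two-stage spread described in this changelog can be modelled roughly as follows. This is a simplified Python sketch, not the kernel's C implementation: the function names `spread` and `two_stage_spread` are hypothetical, and NUMA node grouping is ignored, so the pairing of non-present CPUs may differ from the listing in the changelog (for example `1,5` where the changelog shows `1,6`).

```python
def spread(cpus, masks, start):
    """Distribute `cpus` over `masks` in contiguous chunks.

    Assignment begins at vector index `start` and wraps around.
    Returns the number of vectors that received at least one CPU.
    (Hypothetical helper; the kernel also groups CPUs by NUMA node.)
    """
    cpus = sorted(cpus)
    nvecs = len(masks)
    if not cpus or nvecs == 0:
        return 0
    used = min(nvecs, len(cpus))
    base, extra = divmod(len(cpus), used)
    pos = 0
    for i in range(used):
        size = base + (1 if i < extra else 0)
        masks[(start + i) % nvecs].update(cpus[pos:pos + size])
        pos += size
    return used


def two_stage_spread(nvecs, possible, present):
    """Stage 1: spread over present CPUs. Stage 2: spread the rest."""
    if nvecs == 0:
        return []
    possible = set(possible)
    present = set(present) & possible
    masks = [set() for _ in range(nvecs)]

    # Stage 1: as many vectors as possible go to present CPUs.
    used = spread(present, masks, 0)

    # Stage 2: continue after the last used vector, wrapping to 0
    # when stage 1 already consumed every vector.
    spread(possible - present, masks, used % nvecs)
    return [sorted(m) for m in masks]
```

With 4 vectors, 8 possible CPUs and CPUs 0-3 present, every vector ends up with exactly one present CPU plus one non-present CPU. With all 8 CPUs present, the result is the `0,1 / 2,3 / 4,5 / 6,7` layout shown in the changelog.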
    • genirq/affinity: Allow irq spreading from a given starting point · 1a2d0914
      Ming Lei authored
      To support two stage irq vector spreading, it's required to add a starting
      point to the spreading function. No functional change, just preparatory
      work for the actual two stage change.
      
      [ tglx: Renamed variables, tidied up the code and massaged changelog ]
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: linux-block@vger.kernel.org
      Cc: Laurence Oberman <loberman@redhat.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Link: https://lkml.kernel.org/r/20180308105358.1506-4-ming.lei@redhat.com
    • genirq/affinity: Move actual irq vector spreading into a helper function · b3e6aaa8
      Ming Lei authored
      No functional change, just prepare for converting to 2-stage irq vector
      spreading.
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: linux-block@vger.kernel.org
      Cc: Laurence Oberman <loberman@redhat.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Link: https://lkml.kernel.org/r/20180308105358.1506-3-ming.lei@redhat.com
    • genirq/affinity: Rename *node_to_possible_cpumask as *node_to_cpumask · 47778f33
      Ming Lei authored
      The following patches will introduce two stage irq spreading for improving
      irq spread on all possible CPUs.
      
      No functional change.
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: linux-block@vger.kernel.org
      Cc: Laurence Oberman <loberman@redhat.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Link: https://lkml.kernel.org/r/20180308105358.1506-2-ming.lei@redhat.com
    • genirq/affinity: Don't return with empty affinity masks on error · 0211e12d
      Thomas Gleixner authored
      When the allocation of node_to_possible_cpumask fails, then
      irq_create_affinity_masks() returns with a pointer to the empty affinity
      masks array, which will cause malfunction.
      
      Reorder the allocations so the masks array allocation comes last and every
      failure path returns NULL.
      
      Fixes: 9a0ef98e ("genirq/affinity: Assign vectors to all present CPUs")
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Ming Lei <ming.lei@redhat.com>
  2. 04 Apr, 2018 3 commits
  3. 29 Mar, 2018 2 commits
    • Merge tag 'irqchip-4.17' of... · 71e6882b
      Thomas Gleixner authored
      Merge tag 'irqchip-4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core
      
      Pull irqchip updates for 4.17 from Marc Zyngier:
      
       - New Qualcomm PDC irqchip
       - New Microsemi Ocelot irqchip
       - Suspend/resume support for some oddball GICv3 irqchip
       - Better GIC/GICv3 support for kexec
       - Various cleanups and fixes
    • irqchip/gic: Take lock when updating irq type · aa08192a
      Aniruddha Banerjee authored
      Most MMIO GIC register accesses use a 1-hot bit scheme that
      avoids requiring any form of locking. This isn't true for the
      GICD_ICFGRn registers, which require a RMW sequence.
      
      Unfortunately, we seem to be missing a lock for these particular
      accesses, which could result in a race condition if changing the
      trigger type on any two interrupts within the same set of 16
      interrupts (and thus controlled by the same CFGR register).
      
      Introduce a private lock in the GIC common code for this
      particular case, making it cover both GIC implementations
      in one go.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Aniruddha Banerjee <aniruddhab@nvidia.com>
      [maz: updated changelog]
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
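
The race described in this changelog comes from the read-modify-write on a configuration register that is shared by 16 interrupts. A minimal Python model of the locked update might look like the sketch below. `FakeDistributor`, `set_trigger` and `_cfgr_lock` are hypothetical names for illustration only; the 2-bits-per-interrupt layout follows the GICD_ICFGRn description, but nothing here is the driver's actual code.

```python
import threading

IRQS_PER_CFGR = 16  # each 32-bit GICD_ICFGRn covers 16 interrupts, 2 bits each

# Hypothetical stand-in for the private lock introduced by the commit.
_cfgr_lock = threading.Lock()


class FakeDistributor:
    """Toy model of the distributor's ICFGR register bank."""

    def __init__(self, nr_irqs):
        nregs = (nr_irqs + IRQS_PER_CFGR - 1) // IRQS_PER_CFGR
        self.icfgr = [0] * nregs


def set_trigger(dist, irq, edge):
    """Set interrupt `irq` to edge (True) or level (False) triggered."""
    index = irq // IRQS_PER_CFGR
    # The upper bit of each 2-bit field selects edge vs. level.
    confmask = 0x2 << ((irq % IRQS_PER_CFGR) * 2)

    # Without the lock, two updates to the same register could each read
    # the old value and one of the writes would be lost.
    with _cfgr_lock:
        val = dist.icfgr[index]
        if edge:
            val |= confmask
        else:
            val &= ~confmask
        dist.icfgr[index] = val & 0xFFFFFFFF
```

Interrupts 32 and 33 both live in register index 2, so concurrent calls to `set_trigger` for those two interrupts are exactly the case the lock serializes.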
  4. 28 Mar, 2018 1 commit
    • irqchip/gic: Update supports_deactivate static key to modern api · d01d3274
      Davidlohr Bueso authored
      No changes in semantics -- key init is true; replace
      
      static_key_slow_dec       with   static_branch_disable
      static_key_true           with   static_branch_likely
      
      The first is because we never actually do any counterpart incs,
      thus there are really no reference counting semantics going on.
      Use the more proper static_branch_disable() construct.
      
      Also added a '_key' suffix to supports_deactivate, for better
      self documentation.
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Jason Cooper <jason@lakedaemon.net>
      Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
  5. 23 Mar, 2018 1 commit
    • irqchip/gic-v3: Ensure GICR_CTLR.EnableLPI=0 is observed before enabling · 6eb486b6
      Shanker Donthineni authored
      Booting with GICR_CTLR.EnableLPI=1 is usually a bad idea, and may
      result in subtle memory corruption. Detecting this is thus pretty
      important.
      
      On detecting that LPIs are still enabled, we taint the kernel (because
      we're not sure of anything anymore), and try to disable LPIs. This can
      fail, as implementations are allowed to implement GICR_CTLR.EnableLPI
      as a one-way enable, meaning the redistributors cannot be reprogrammed
      with new tables.
      
      Should this happen, we fail probing the redistributor and warn the user
      that things are pretty dire.
      Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org>
      [maz: reworded changelog, minor comment and message changes]
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
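
The probe-time decision described in this changelog can be illustrated with a small Python model. Everything here is a sketch under assumptions: `FakeRedistributor` and `probe_redistributor` are hypothetical names, the log list stands in for the kernel's taint and warning messages, and any polling the real driver does after the write is omitted.

```python
GICR_CTLR_ENABLE_LPIS = 1 << 0  # EnableLPI is bit 0 of GICR_CTLR


class FakeRedistributor:
    """Toy model of a redistributor's GICR_CTLR register."""

    def __init__(self, ctlr=0, one_way_enable=False):
        self.ctlr = ctlr
        # Some implementations treat EnableLPI as a one-way enable.
        self.one_way_enable = one_way_enable

    def read_ctlr(self):
        return self.ctlr

    def write_ctlr(self, value):
        if self.one_way_enable and (self.ctlr & GICR_CTLR_ENABLE_LPIS):
            value |= GICR_CTLR_ENABLE_LPIS  # the clear is ignored
        self.ctlr = value


def probe_redistributor(rdist, log):
    """Return True if it is safe to program new LPI tables."""
    if not (rdist.read_ctlr() & GICR_CTLR_ENABLE_LPIS):
        return True  # clean boot, nothing to do

    # LPIs were left enabled: record the taint and try to disable them.
    log.append("taint: LPIs enabled at boot")
    rdist.write_ctlr(rdist.read_ctlr() & ~GICR_CTLR_ENABLE_LPIS)

    if rdist.read_ctlr() & GICR_CTLR_ENABLE_LPIS:
        # One-way enable: the redistributor cannot be reprogrammed.
        log.append("error: failed to disable LPIs, probing fails")
        return False
    return True
```

The three outcomes are: LPIs already off (probe succeeds silently), LPIs on but clearable (probe succeeds after tainting), and LPIs on with a one-way enable (probe fails).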
  6. 22 Mar, 2018 3 commits
  7. 20 Mar, 2018 7 commits
  8. 16 Mar, 2018 2 commits
  9. 14 Mar, 2018 12 commits
  10. 12 Mar, 2018 2 commits
  11. 11 Mar, 2018 2 commits
    • Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · ed58d66f
      Linus Torvalds authored
      Pull x86/pti updates from Thomas Gleixner:
       "Yet another pile of melted spectrum related updates:
      
         - Drop native vsyscall support finally as it causes more trouble than
           benefit.
      
         - Make microcode loading more robust. There were a few issues
           especially related to late loading which are now surfacing because
           late loading of the IB* microcodes addressing spectre issues has
           become more widely used.
      
         - Simplify and robustify the syscall handling in the entry code
      
         - Prevent kprobes on the entry trampoline code which lead to kernel
           crashes when the probe hits before CR3 is updated
      
         - Don't check microcode versions when running on hypervisors as they
           are considered as lying anyway.
      
         - Fix the 32bit objtool build and a comment typo"
      
      * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/kprobes: Fix kernel crash when probing .entry_trampoline code
        x86/pti: Fix a comment typo
        x86/microcode: Synchronize late microcode loading
        x86/microcode: Request microcode on the BSP
        x86/microcode/intel: Look into the patch cache first
        x86/microcode: Do not upload microcode if CPUs are offline
        x86/microcode/intel: Writeback and invalidate caches before updating microcode
        x86/microcode/intel: Check microcode revision before updating sibling threads
        x86/microcode: Get rid of struct apply_microcode_ctx
        x86/spectre_v2: Don't check microcode versions when running under hypervisors
        x86/vsyscall/64: Drop "native" vsyscalls
        x86/entry/64/compat: Save one instruction in entry_INT80_compat()
        x86/entry: Do not special-case clone(2) in compat entry
        x86/syscalls: Use COMPAT_SYSCALL_DEFINEx() macros for x86-only compat syscalls
        x86/syscalls: Use proper syscall definition for sys_ioperm()
        x86/entry: Remove stale syscall prototype
        x86/syscalls/32: Simplify $entry == $compat entries
        objtool: Fix 32-bit build
    • Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 1ad5daa6
      Linus Torvalds authored
      Pull timer fix from Thomas Gleixner:
       "Just a single fix which adds a missing Kconfig dependency to avoid
        unmet dependency warnings"
      
      * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        clocksource/atmel-st: Add 'depends on HAS_IOMEM' to fix unmet dependency