1. 19 Feb, 2011 18 commits
    • genirq: Move irq thread flags to core · 1535dfac
      Thomas Gleixner authored
      Solely used in core code.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • genirq: Mark polled irqs and defer the real handler · fe200ae4
      Thomas Gleixner authored
      With the chip.end() function gone we might run into a situation where
      a poll call runs and the real interrupt comes in, sees IRQ_INPROGRESS
      and disables the line. That might be a perfect working one, which will
      then be masked forever.
      
      So mark them polled while the poll runs. When the real handler sees
      IRQ_INPROGRESS it checks the poll flag and waits for the polling to
      complete. Add the necessary amount of sanity checks to it to avoid
      deadlocks.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • genirq: spurious: Run only one poller at a time · d05c65ff
      Thomas Gleixner authored
      No point in running concurrent pollers which confuse each other by
      setting PENDING.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • genirq: Do not poll disabled, percpu and timer interrupts · c7259cd7
      Thomas Gleixner authored
      There is no point in polling disabled lines.
      
      percpu does not make sense at all because we only poll on the cpu
      we're currently running on. Also polling per_cpu interrupts is racy as
      hell. The handler runs without locking so we might get a huge
      surprise.
      
      If the timer interrupt needs polling, then we won't get there anyway.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • genirq: Fixup poll handling · fa27271b
      Thomas Gleixner authored
      try_one_irq() contains redundant code and lots of useless checks for
      shared interrupts. Check for shared before setting IRQ_INPROGRESS and
      then call handle_IRQ_event() while pending. Shorter version with the
      same functionality.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • genirq: Warn when handler enables interrupts · b738a50a
      Thomas Gleixner authored
      We run all handlers with interrupts disabled and expect them not to
      enable them. Warn when we catch one who does.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
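The check described in this commit can be illustrated in userspace. This is only a sketch under stated assumptions: the interrupt flag is simulated by a plain variable, and every `sim_*` name is hypothetical and does not exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for the CPU interrupt flag; not kernel APIs. */
static bool sim_irqs_enabled;
static int sim_warnings;

typedef int (*sim_handler_t)(int irq);

/* Run one handler with "interrupts" off and warn if it turned them on. */
static int sim_run_handler(int irq, sim_handler_t handler)
{
    assert(handler != NULL);

    sim_irqs_enabled = false;
    int ret = handler(irq);

    if (sim_irqs_enabled) {
        fprintf(stderr, "irq %d: handler enabled interrupts\n", irq);
        sim_warnings++;
        sim_irqs_enabled = false; /* restore the state the core expects */
    }
    return ret;
}

/* A well-behaved handler leaves the flag alone. */
static int sim_good_handler(int irq)
{
    (void)irq;
    return 1;
}

/* A misbehaving handler re-enables "interrupts". */
static int sim_bad_handler(int irq)
{
    (void)irq;
    sim_irqs_enabled = true;
    return 1;
}
```

Calling `sim_run_handler()` with `sim_bad_handler` bumps `sim_warnings` and leaves the simulated flag disabled again, while `sim_good_handler` passes silently.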
    • genirq: Plug race in report_bad_irq() · 1082687e
      Thomas Gleixner authored
      We cannot walk the action chain unlocked. Even if IRQ_INPROGRESS is
      set an action can be removed and we follow a null pointer. It's safe
      to take the lock there, because the code which removes the action will
      call synchronize_irq() which waits unlocked for IRQ_INPROGRESS going
      away.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • genirq: Remove redundant thread affinity setting · 2b879eaf
      Thomas Gleixner authored
      Thread affinity is already set by setup_affinity().
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • genirq: Do not copy affinity before set · 3b8249e7
      Thomas Gleixner authored
      While rummaging through arch code I found that there are a few
      workarounds which deal with the fact that the initial affinity setting
      from request_irq() copies the mask into irq_data->affinity before the
      chip code is called. In the normal path we unconditionally copy the
      mask when the chip code returns 0.
      
      Copy after the code is called and add a return code
      IRQ_SET_MASK_OK_NOCOPY for the chip functions, which prevents the
      copy. That way we see the real mask when the chip function decided to
      truncate it further as some arches do. IRQ_SET_MASK_OK is 0, which is
      the current behaviour.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
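The copy-after-call behaviour can be sketched in userspace as follows. Assumptions: a 64-bit word stands in for a cpumask, `struct sim_irq_data` and all `sim_*` functions are hypothetical, and the numeric value of `IRQ_SET_MASK_OK_NOCOPY` is chosen for illustration (the commit message only states that `IRQ_SET_MASK_OK` is 0).

```c
#include <assert.h>
#include <stdint.h>

/* Names from the commit message; OK is 0 per the text, 1 is an assumption. */
enum { IRQ_SET_MASK_OK = 0, IRQ_SET_MASK_OK_NOCOPY = 1 };

/* Simplified stand-in for struct irq_data. */
struct sim_irq_data {
    uint64_t affinity;
};

typedef int (*sim_set_affinity_t)(struct sim_irq_data *d, uint64_t mask);

/* Call the chip function first, then copy the mask unless told not to. */
static int sim_do_set_affinity(struct sim_irq_data *d,
                               sim_set_affinity_t chip_fn, uint64_t mask)
{
    assert(d && chip_fn);

    int ret = chip_fn(d, mask);

    switch (ret) {
    case IRQ_SET_MASK_OK:
        d->affinity = mask;      /* chip accepted the mask as given */
        return 0;
    case IRQ_SET_MASK_OK_NOCOPY:
        return 0;                /* chip already stored its own mask */
    default:
        return ret;              /* error: leave affinity untouched */
    }
}

/* Chip that accepts the mask unchanged. */
static int sim_chip_plain(struct sim_irq_data *d, uint64_t mask)
{
    (void)d;
    (void)mask;
    return IRQ_SET_MASK_OK;
}

/* Chip that truncates the mask to its lowest set CPU and stores it itself. */
static int sim_chip_truncating(struct sim_irq_data *d, uint64_t mask)
{
    d->affinity = mask & (~mask + 1);
    return IRQ_SET_MASK_OK_NOCOPY;
}
```

With `sim_chip_truncating`, the stored affinity reflects the mask the chip actually programmed, which is the point of the new return code.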
    • genirq: Always apply cpu online mask · 569bda8d
      Thomas Gleixner authored
      If the affinity had been set by the user, then a later request_irq()
      will honour that setting. But online cpus can have changed. So apply
      the online mask for this case as well.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
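A minimal userspace sketch of the idea, assuming 64-bit words in place of cpumasks. `sim_setup_affinity()` and its parameters are hypothetical; the fallback to the default mask when the user mask has no online CPU is an assumption added for illustration, not something the commit message states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Pick the affinity for a newly requested irq.
 * A user-set mask is honoured, but the online mask is applied in
 * every case so that offline CPUs never end up in the result.
 */
static uint64_t sim_setup_affinity(uint64_t user_mask, bool user_set,
                                   uint64_t default_mask, uint64_t online)
{
    assert(online != 0); /* at least one CPU must be online */

    uint64_t set = default_mask;

    /* Assumption: ignore a user mask that has no online CPU left. */
    if (user_set && (user_mask & online))
        set = user_mask;

    return set & online;
}
```

For example, a user mask of `0xC` with CPUs 0 to 2 online (`0x7`) yields `0x4`: CPU 3 went offline after the user set the mask and is filtered out.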
    • genirq: Remove redundant check · b008207c
      Thomas Gleixner authored
      IRQ_NO_BALANCING is already checked in irq_can_set_affinity() above,
      no need to check it again.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • genirq: Simplify affinity related code · 1fa46f1f
      Thomas Gleixner authored
      There is a lot of #ifdef CONFIG_GENERIC_PENDING_IRQ along with
      duplicated code in the irq core. Move the #ifdeffery into one place
      and clean up the code so it's readable. No functional change.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • genirq: Namespace cleanup · a0cd9ca2
      Thomas Gleixner authored
      The irq namespace has become quite convoluted. My bad.  Clean it up
      and deprecate the old functions. All new functions follow the scheme:
      
      irq number based:
          irq_set/get/xxx/_xxx(unsigned int irq, ...)
      
      irq_data based:
          irq_data_set/get/xxx/_xxx(struct irq_data *d, ...)
      
      irq_desc based:
          irq_desc_get_xxx(struct irq_desc *desc)
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • genirq: Add missing buslock to set_irq_type(), set_irq_wake() · 43abe43c
      Thomas Gleixner authored
      Chips behind a slow bus cannot update the chip under desc->lock, but
      we miss the chip_buslock/chip_bus_sync_unlock() calls around the set
      type and set wake functions.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • genirq: Make nr_irqs runtime expandable · e7bcecb7
      Thomas Gleixner authored
      We face more and more the requirement to expand nr_irqs at
      runtime. The reason is irq expanders which cannot be detected in the
      early boot stage. So we speculate nr_irqs to have enough room. Further,
      Xen needs extra irq numbers and we really want to avoid adding more
      "detection" code into the early boot. There is no real good reason why
      we need to limit nr_irqs at early boot.
      
      Allow the allocation code to expand nr_irqs. We already have 8k extra
      number space in the allocation bitmap, so let's use it.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
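The expansion can be sketched in userspace like this. Assumptions: the sizes are tiny illustrative values (8 spare slots instead of the 8k mentioned above), a `bool` array replaces the bitmap, and all `sim_*` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define SIM_NR_IRQS     16                 /* compile-time estimate */
#define SIM_BITMAP_BITS (SIM_NR_IRQS + 8)  /* spare number space */

static bool sim_allocated[SIM_BITMAP_BITS];
static unsigned int sim_nr_irqs = SIM_NR_IRQS;

/*
 * Allocate cnt consecutive free irq numbers, searching from 'from'.
 * Returns the first number, or -1 if the bitmap has no room.
 * sim_nr_irqs grows when the range ends beyond the current limit.
 */
static int sim_alloc_irqs(unsigned int from, unsigned int cnt)
{
    assert(cnt > 0);

    for (unsigned int start = from; start + cnt <= SIM_BITMAP_BITS; start++) {
        unsigned int i;

        /* Count how many slots from 'start' are free. */
        for (i = 0; i < cnt && !sim_allocated[start + i]; i++)
            ;
        if (i < cnt)
            continue; /* range is partly in use, try the next start */

        if (start + cnt > sim_nr_irqs)
            sim_nr_irqs = start + cnt; /* expand at runtime */

        for (i = 0; i < cnt; i++)
            sim_allocated[start + i] = true;
        return (int)start;
    }
    return -1;
}
```

Once the first 16 numbers are taken, a further request is served from the spare space and `sim_nr_irqs` grows to cover it; a request that does not fit into the bitmap fails without changing the limit.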
    • Merge branch 'irq/urgent' into irq/core · 218502bf
      Thomas Gleixner authored
      Reason: Further patches are conflicting with mainline fixes
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • genirq: Disable the SHIRQ_DEBUG call in request_threaded_irq for now · 6d83f94d
      Thomas Gleixner authored
      With CONFIG_SHIRQ_DEBUG=y we call a newly installed interrupt handler
      in request_threaded_irq().
      
      The original implementation (commit a304e1b8) called the handler
      _BEFORE_ it was installed, but that caused problems with handlers
      calling disable_irq_nosync(). See commit 377bf1e4.
      
      It's braindead in the first place to call disable_irq_nosync in shared
      handlers, but ....
      
      Moving this call after we installed the handler looks innocent, but it
      is very subtly broken on SMP.
      
      Interrupt handlers rely on the fact that the irq core prevents
      reentrancy.
      
      Now this debug call violates that promise because we run the handler
      w/o the IRQ_INPROGRESS protection - which we cannot apply here because
      that would result in a possibly forever masked interrupt line.
      
      A concurrent real hardware interrupt on a different CPU results in
      handler reentrancy and can lead to complete wreckage, which was
      unfortunately observed in reality and took a fricking long time to
      debug.
      
      Leave the code here for now. We want this debug feature, but that's
      not easy to fix. We really should get rid of those
      disable_irq_nosync() abusers and remove that function completely.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Anton Vorontsov <avorontsov@ru.mvista.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: stable@kernel.org # .28 -> .37
    • genirq: Prevent access beyond allocated_irqs bitmap · c1ee6264
      Thomas Gleixner authored
      Lars-Peter Clausen pointed out:
      
         I stumbled upon this while looking through the existing archs using
         SPARSE_IRQ.  Even with SPARSE_IRQ the NR_IRQS is still the upper
         limit for the number of IRQs.
      
         Both PXA and MMP set NR_IRQS to IRQ_BOARD_START, with
         IRQ_BOARD_START being the number of IRQs used by the core.
      
         In various machine files the nr_irqs field of the ARM machine
         definition struct is then set to "IRQ_BOARD_START + NR_BOARD_IRQS".
      
         As a result "nr_irqs" will be greater than NR_IRQS which then again
         causes the "allocated_irqs" bitmap in the core irq code to be
         accessed beyond its size overwriting unrelated data.
      
      The core code really misses a sanity check there.
      
      This went unnoticed so far as by chance the compiler/linker places
      data behind that bitmap which gets initialized later on those affected
      platforms.
      
      So the obvious fix would be to add a sanity check in early_irq_init()
      and break all affected platforms. Though that check wants to be
      backported to stable as well, which will require fixing all known
      problematic platforms and probably some not yet known ones as
      well. Lots of churn.
      
      A way simpler solution is to allocate a slightly larger bitmap and
      avoid the whole churn w/o breaking anything. Add a few warnings when
      an arch returns utter crap.
      Reported-by: Lars-Peter Clausen <lars@metafoo.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@kernel.org # .37
      Cc: Haojian Zhuang <haojian.zhuang@marvell.com>
      Cc: Eric Miao <eric.y.miao@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
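The "larger bitmap plus warning" approach might look roughly like this in userspace. Assumptions: the sizes are small illustrative values, `fprintf()` stands in for a kernel warning, and `sim_clamp_nr_irqs()` is a hypothetical helper, not the function touched by the commit.

```c
#include <assert.h>
#include <stdio.h>

#define SIM_NR_IRQS     16                 /* arch compile-time limit */
#define SIM_BITMAP_BITS (SIM_NR_IRQS + 8)  /* slightly larger bitmap */

/*
 * Validate the irq count an arch reports at early init.
 * Counts above SIM_NR_IRQS are tolerated as long as they fit into the
 * enlarged bitmap; anything beyond that is clamped with a warning.
 */
static unsigned int sim_clamp_nr_irqs(unsigned int arch_nr_irqs)
{
    assert(SIM_BITMAP_BITS > SIM_NR_IRQS);

    if (arch_nr_irqs > SIM_BITMAP_BITS) {
        fprintf(stderr, "arch reported %u irqs, clamping to %d\n",
                arch_nr_irqs, SIM_BITMAP_BITS);
        return SIM_BITMAP_BITS;
    }
    return arch_nr_irqs;
}
```

A platform that reports 20 irqs against a limit of 16 keeps working because the bitmap has room for it, while an absurd value is clamped instead of overrunning the bitmap.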
  2. 18 Feb, 2011 13 commits
  3. 17 Feb, 2011 9 commits