1. 20 Dec, 2007 6 commits
    • dm crypt: use bio_add_page · 91e10625
      Milan Broz authored
      Fix possible max_phys_segments violation in cloned dm-crypt bio.
      
      For a write, dm-crypt must allocate a new bio and run the crypto
      operation on this clone. The cloned request always has the same
      size, but its number of physical segments can increase and violate
      the max_phys_segments restriction.
      
      This can lead to data corruption and serious hardware malfunction.
      This was observed when using XFS over dm-crypt and at least
      two HBA controller drivers (arcmsr, cciss) recently.
      
      Fix it by calling bio_add_page() (which also tests for other
      restrictions) instead of constructing our own biovec.
      
      All versions of dm-crypt are affected by this bug.
      
      Cc: stable@kernel.org
      Cc: dm-crypt@saout.de
      Signed-off-by: Milan Broz <mbroz@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
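
As a rough illustration of the failure mode described in this commit, the user-space C sketch below models why a clone of the same size can need more segments than the original: freshly allocated pages are not necessarily physically adjacent, so fewer of them merge. All names here (`toy_limits`, `toy_bio`, `toy_bio_add_page`) are hypothetical stand-ins, not the kernel's structures or the real `bio_add_page()` signature. The only behaviour taken from the commit message is that pages are added one at a time and the add is refused once a limit would be exceeded.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 4096u

/* Hypothetical stand-in for a queue's limits (not the kernel struct). */
struct toy_limits {
    unsigned max_segments;
};

/* Hypothetical stand-in for a bio: counts segments and tracks the end
 * of the last one so that adjacent pages can be merged. */
struct toy_bio {
    unsigned segments;
    unsigned long last_end;   /* address just past the last segment */
    size_t size;              /* total bytes added so far */
};

/* Add one page-sized (or smaller) chunk. Returns len on success and 0
 * if the chunk would need a new segment beyond max_segments. */
unsigned toy_bio_add_page(struct toy_bio *bio, const struct toy_limits *lim,
                          unsigned long phys_addr, unsigned len)
{
    assert(len > 0 && len <= TOY_PAGE_SIZE);

    if (bio->segments == 0 || bio->last_end != phys_addr) {
        /* Not adjacent to the previous chunk: a new segment is needed. */
        if (bio->segments >= lim->max_segments)
            return 0;
        bio->segments++;
    }
    /* Adjacent chunks fall through and extend the current segment. */
    bio->last_end = phys_addr + len;
    bio->size += len;
    return len;
}
```

With a limit of two segments, two adjacent pages and one distant page are accepted, while a further non-adjacent page is refused. A caller that builds its own vector without such a check would not notice that the limit had been passed.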
    • dm: merge max_hw_sector · 91212507
      Neil Brown authored
      Make sure dm honours max_hw_sectors of underlying devices
      
        We still have no firm testing evidence in support of this patch but
        believe it may help to resolve some bug reports.  - agk
      Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
    • dm: trigger change uevent on rename · 69267a30
      Alasdair G Kergon authored
      Insert a missing KOBJ_CHANGE notification when a device is renamed.
      
      Cc: Scott James Remnant <scott@ubuntu.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
    • dm crypt: fix write endio · adfe4770
      Milan Broz authored
      Fix BIO_UPTODATE test for write io.
      
      Cc: stable@kernel.org
      Cc: dm-crypt@saout.de
      Signed-off-by: Milan Broz <mbroz@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
    • dm mpath: hp requires scsi · d1622e89
      Paul Mundt authored
      With CONFIG_SCSI=n __scsi_print_sense() is never linked in.
      
      drivers/built-in.o: In function `hp_sw_end_io':
      dm-mpath-hp-sw.c:(.text+0x914f8): undefined reference to `__scsi_print_sense'
      
      Caught with a randconfig on current git.
      Signed-off-by: Paul Mundt <lethal@linux-sh.org>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
    • dm: table detect io beyond device · 512875bd
      Jun'ichi Nomura authored
      This patch fixes a panic on shrinking a DM device if there is
      outstanding I/O to the part of the device that is being removed.
      (Normally this doesn't happen - a filesystem would be resized first,
      for example.)
      
      The bug is that __clone_and_map() assumes dm_table_find_target()
      always returns a valid pointer.  It may fail if a bio arrives from the
      block layer but its target sector is no longer included in the DM
      btree.
      
      This patch appends an empty entry to table->targets[] which will
      be returned by a lookup beyond the end of the device.
      
      After calling dm_table_find_target(), __clone_and_map() and
      target_message() check for this condition using dm_target_is_valid().
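
The sentinel-entry idea described above can be sketched in user-space C as follows. This is a simplified model under stated assumptions: the real code looks targets up through a btree, whereas this sketch uses a linear scan, and every name (`toy_table`, `toy_target`, `toy_find_target`, `toy_target_is_valid`) is hypothetical. The point it tries to show is that a lookup never returns a dangling pointer, because an out-of-range sector yields the extra empty entry, which the caller can then test.

```c
#include <assert.h>
#include <stddef.h>

struct toy_table;

/* Hypothetical target: the sentinel entry is recognised by table == NULL. */
struct toy_target {
    struct toy_table *table;
    unsigned long long begin;   /* first sector covered */
    unsigned long long len;     /* number of sectors covered */
};

/* targets[] holds num_targets real entries plus one empty sentinel. */
struct toy_table {
    unsigned num_targets;
    struct toy_target *targets;
};

/* Always returns a valid pointer: either the target covering the
 * sector, or the empty sentinel at targets[num_targets]. */
struct toy_target *toy_find_target(struct toy_table *t,
                                   unsigned long long sector)
{
    unsigned i;

    assert(t != NULL && t->targets != NULL);
    for (i = 0; i < t->num_targets; i++) {
        struct toy_target *tgt = &t->targets[i];

        if (sector >= tgt->begin && sector < tgt->begin + tgt->len)
            return tgt;
    }
    return &t->targets[t->num_targets];
}

/* Callers check this before using the result of a lookup. */
int toy_target_is_valid(const struct toy_target *tgt)
{
    return tgt->table != NULL;
}
```

A caller in this model would look up the sector, test the result with `toy_target_is_valid()`, and fail the I/O instead of dereferencing an entry that no longer maps any part of the device.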
      
      Sample test script to trigger oops:
  2. 19 Dec, 2007 24 commits
  3. 18 Dec, 2007 10 commits
    • [SCSI] initio: bugfix for accessors patch · a169e637
      Boaz Harrosh authored
      The patch "[SCSI] initio: convert to use the data buffer accessors"
      had a small but fatal bug: it didn't increment the pointer into the
      initio scatterlist descriptors as it looped over the block-generated
      ones. Fixed here.
      Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
      Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
    • [SCSI] st: fix kernel BUG at include/linux/scatterlist.h:59! · cd81621c
      FUJITA Tomonori authored
      This is caused by a missing scatterlist initialisation (it only shows
      up when sg list handling debugging is turned on).
      Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Cc: Kai Makisara <Kai.Makisara@kolumbus.fi>
      Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
    • [SCSI] initio: fix conflict when loading driver · 99f1f534
      Alan Cox authored
      > I have a scanner connected to a Initio INI-950 SCSI card and I recently
      > upgraded from SuSE 10.2 to 10.3.  The new kernel doesn't see any of my
      > devices.  I get the following in /var/log/messages:
      >
      > ACPI: PCI Interrupt 0000:00:0a.0[A] -> GSI 17 (level, low) -> IRQ 16
      > initio: I/O port range 0x0 is busy.
      > ACPI: PCI interrupt for device 0000:00:0a.0 disabled
      
      Hmm, not a collision - that's a bug in the driver update.  It looks
      like the changes I made, combined with Christoph's, lost a line
      somewhere when I was merging it all.
      Signed-off-by: Alan Cox <alan@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
    • [SCSI] sym53c8xx: fix "irq X: nobody cared" regression · cedefa13
      Tony Battersby authored
      The patch described by the following excerpt from ChangeLog-2.6.24-rc1
      eventually causes an "irq X: nobody cared" error after a while:
      
      commit 99c9e0a1
      Author: Matthew Wilcox <matthew@wil.cx>
      Date:   Fri Oct 5 15:55:12 2007 -0400
      
          [SCSI] sym53c8xx: Make interrupt handler capable of returning IRQ_NONE
      
      After this happens, the kernel disables the IRQ, causing the SCSI card
      to stop working until the next reboot.  The problem is caused by the
      interrupt handler returning IRQ_NONE instead of IRQ_HANDLED after
      handling an interrupt-on-the-fly (INTF) condition.  The following patch
      fixes the problem.
      Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
      Acked-by: Matthew Wilcox <willy@linux.intel.com>
      Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
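
The return-value rule behind this fix can be illustrated with a small user-space C sketch. It is a toy model, not the sym53c8xx handler: the enum, the bit masks, and `toy_interrupt()` are hypothetical names, and the status value is simply passed in rather than read from hardware. It shows a handler that reports "handled" whenever it acted on any condition, including an interrupt-on-the-fly, and reports "none" only when the device raised nothing.

```c
#include <assert.h>

/* Hypothetical stand-ins for the kernel's IRQ_NONE / IRQ_HANDLED. */
enum toy_irqreturn { TOY_IRQ_NONE = 0, TOY_IRQ_HANDLED = 1 };

/* Hypothetical status bits for this sketch. */
#define TOY_INTF 0x04u   /* interrupt-on-the-fly */
#define TOY_SIP  0x02u   /* SCSI interrupt pending */
#define TOY_DIP  0x01u   /* DMA interrupt pending */

/* istat models an 8-bit interrupt status register value. */
enum toy_irqreturn toy_interrupt(unsigned istat)
{
    enum toy_irqreturn ret = TOY_IRQ_NONE;

    assert((istat & ~0xffu) == 0);

    if (istat & TOY_INTF) {
        /* Completed work would be acknowledged and processed here.
         * This counts as handling the interrupt. */
        ret = TOY_IRQ_HANDLED;
    }
    if (istat & (TOY_SIP | TOY_DIP)) {
        /* Exception conditions would be serviced here. */
        ret = TOY_IRQ_HANDLED;
    }
    return ret;
}
```

In this model, returning `TOY_IRQ_NONE` for the INTF-only case would reproduce the regression described above: the interrupt core would count the interrupt as unclaimed and, after enough of them, disable the line.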
    • [SCSI] dpt_i2o: driver is only 32 bit so don't set 64 bit DMA mask · c80ddf00
      James Bottomley authored
      This fixes a potential corruption bug where the truncation would cause
      reading or writing to the wrong memory area on machines with >4GB of
      main memory.
      
      Cc: Stable Kernel Tree <stable@kernel.org>
      Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
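
The truncation hazard mentioned in this commit can be shown with plain integer arithmetic. The sketch below is user-space C with hypothetical helper names (`toy_truncate_dma_addr`, `toy_addr_fits_mask`); it does not use the kernel DMA API. It assumes a driver that stores bus addresses in 32-bit fields and shows that an address above 4GB silently becomes a different, lower address, which is why such a driver should only be handed addresses that fit a 32-bit mask.

```c
#include <assert.h>
#include <stdint.h>

/* What a 32-bit-only driver effectively does when it stores a bus
 * address in a 32-bit field: the upper 32 bits are lost. */
uint32_t toy_truncate_dma_addr(uint64_t bus_addr)
{
    return (uint32_t)bus_addr;
}

/* Returns 1 if bus_addr is representable under dma_mask, else 0. */
int toy_addr_fits_mask(uint64_t bus_addr, uint64_t dma_mask)
{
    assert(dma_mask != 0);
    return (bus_addr & ~dma_mask) == 0;
}
```

For example, the address 0x100001000 (just above 4GB) truncates to 0x1000, so a transfer meant for high memory would land in low memory instead. Checking it against a mask of 0xffffffff rejects it up front.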
    • [SCSI] sym53c8xx: fix free_irq() regression · 7ee2413c
      Tony Battersby authored
      The following commit changed the pointer passed to request_irq(), but
      failed to change the pointer passed to free_irq():
      
      commit 99c9e0a1
      Author: Matthew Wilcox <matthew@wil.cx>
      Date:   Fri Oct 5 15:55:12 2007 -0400
      
          [SCSI] sym53c8xx: Make interrupt handler capable of returning IRQ_NONE
      
          ...
      
      The result is that free_irq() doesn't actually take any action.  This
      patch fixes it.
      Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
      Acked-by: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
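
To illustrate why a mismatched pointer makes the release a no-op, here is a minimal user-space C model of a handler registry keyed by a caller-supplied cookie. The names (`toy_irq_line`, `toy_request_irq`, `toy_free_irq`) and the fixed-size slot array are hypothetical and much simpler than the kernel's `request_irq()` and `free_irq()`. The sketch only assumes what the commit message states: the handler is removed by matching the same pointer that was passed at registration.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_HANDLERS 4

/* Hypothetical interrupt line: each slot holds one registered cookie. */
struct toy_irq_line {
    void *dev_id[TOY_MAX_HANDLERS];
};

/* Register dev_id on the line. Returns 0 on success, -1 if full. */
int toy_request_irq(struct toy_irq_line *line, void *dev_id)
{
    size_t i;

    assert(dev_id != NULL);
    for (i = 0; i < TOY_MAX_HANDLERS; i++) {
        if (line->dev_id[i] == NULL) {
            line->dev_id[i] = dev_id;
            return 0;
        }
    }
    return -1;
}

/* Remove the handler registered with dev_id. Returns 1 if a handler
 * was removed, 0 if no registered cookie matched. */
int toy_free_irq(struct toy_irq_line *line, void *dev_id)
{
    size_t i;

    for (i = 0; i < TOY_MAX_HANDLERS; i++) {
        if (line->dev_id[i] != NULL && line->dev_id[i] == dev_id) {
            line->dev_id[i] = NULL;
            return 1;
        }
    }
    return 0;
}
```

Registering with one pointer and freeing with another leaves the original registration in place, which mirrors the regression: the handler stays installed even though the driver believes it released the interrupt.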
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86 · 3e3b3916
      Linus Torvalds authored
      * git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86:
        x86: fix "Kernel panic - not syncing: IO-APIC + timer doesn't work!"
        genirq: revert lazy irq disable for simple irqs
        x86: also define AT_VECTOR_SIZE_ARCH
        x86: kprobes bugfix
        x86: jprobe bugfix
        timer: kernel/timer.c section fixes
        genirq: add unlocked version of set_irq_handler()
        clockevents: fix reprogramming decision in oneshot broadcast
        oprofile: op_model_athlon.c support for AMD family 10h barcelona performance counters
    • x86: fix "Kernel panic - not syncing: IO-APIC + timer doesn't work!" · 4aae0702
      Ingo Molnar authored
      this is the tale of a full day spent debugging an ancient but elusive bug.
      
      after booting up thousands of random .config kernels, i finally happened
      to generate a .config that produced the following rare bootup failure
      on 32-bit x86:
      
      | ..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1
      | ..MP-BIOS bug: 8254 timer not connected to IO-APIC
      | ...trying to set up timer (IRQ0) through the 8259A ...  failed.
      | ...trying to set up timer as Virtual Wire IRQ... failed.
      | ...trying to set up timer as ExtINT IRQ... failed :(.
      | Kernel panic - not syncing: IO-APIC + timer doesn't work!  Boot with apic=debug
      | and send a report.  Then try booting with the 'noapic' option
      
      this bug has been reported many times during the years, but it was never
      reproduced nor fixed.
      
      the bug that i hit was extremely sensitive to .config details.
      
      First i did a .config-bisection - suspecting some .config detail.
      That led to CONFIG_X86_MCE: enabling X86_MCE magically made the bug disappear
      and the system would boot up just fine.
      
      Debugging my way through the MCE code ended up identifying two unlikely
      candidates: the thing that made a real difference to the hang was that
      X86_MCE did two printks:
      
       Intel machine check architecture supported.
       Intel machine check reporting enabled on CPU#1.
      
      Adding the same printks to a !CONFIG_X86_MCE kernel made the bug go away!
      
      this left timing as the main suspect: i experimented with adding various
      udelay()s to the arch/x86/kernel/io_apic_32.c:check_timer() function, and
      the race window turned out to be narrower than 30 microseconds (!).
      
      That made debugging especially funny, debugging without having printk
      ability before the bug hits is ... interesting ;-)
      
      eventually i started suspecting IRQ activities - those are pretty much the
      only thing that happen this early during bootup and have the timescale of
      a few dozen microseconds. Also, check_timer() changes the IRQ hardware
      in various creative ways, so the main candidate became IRQ0 interaction.
      
      i've added a counter to track timer irqs (on which core they arrived, at
      what exact time, etc.) and found that no timer IRQ would arrive after the
      bug condition hits - even if we re-enable IRQ0 and re-initialize the i8259A,
      but that we'd get a small number of timer irqs right around the time when we
      call the check_timer() function.
      
      Eventually i got the following backtrace triggered from debug code in the
      timer interrupt:
      
      ...trying to set up timer as Virtual Wire IRQ... failed.
      ...trying to set up timer as ExtINT IRQ...
      Pid: 1, comm: swapper Not tainted (2.6.24-rc5 #57)
      EIP: 0060:[<c044d57e>] EFLAGS: 00000246 CPU: 0
      EIP is at _spin_unlock_irqrestore+0x5/0x1c
      EAX: c0634178 EBX: 00000000 ECX: c4947d63 EDX: 00000246
      ESI: 00000002 EDI: 00010031 EBP: c04e0f2e ESP: f7c41df4
       DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
       CR0: 8005003b CR2: ffe04000 CR3: 00630000 CR4: 000006d0
       DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
       DR6: ffff0ff0 DR7: 00000400
        [<c05f5784>] setup_IO_APIC+0x9c3/0xc5c
      
      the spin_unlock() was called from init_8259A(). Wait ... we have an IRQ0
      entry while we are in the middle of setting up the local APIC, the i8259A
      and the PIT??
      
      That is certainly not how it's supposed to work! check_timer() was supposed
      to be called with irqs turned off - but this eroded away sometime in the
      past. This code would still work most of the time because this code runs
      very quickly, but if just the right timing conditions are present and IRQ0
      hits in this small, ~30 usecs window, timer irqs stop and the system does
      not boot up. Also, given how early this is during bootup, the hang is
      very deterministic - but it would only occur on certain machines (and
      certain configs).
      
      The fix was quite simple: disable/restore interrupts properly in this
      function. With that in place the test-system now boots up just fine.
      
      (64-bit x86 io_apic_64.c had the same bug.)
      
      Phew! One down, only 1500 other kernel bugs are left ;-)
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • genirq: revert lazy irq disable for simple irqs · 971e5b35
      Steven Rostedt authored
      In commit 76d21601 lazy irq disabling
      was implemented, and the simple irq handler had a masking set to it.
      
      Remy Bohmer discovered that some devices in the ARM architecture
      would trigger the mask, but never unmask it. His patch to do the
      unmasking was questioned by Russell King about masking simple irqs
      to begin with. Looking further, it was discovered that the problems
      Remy was seeing were due to improper use of the simple handler by
      devices, and he later submitted patches to fix those. But the issue
      that was uncovered was that the simple handler should never mask.
      
      This patch reverts the masking in the simple handler.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
    • x86: also define AT_VECTOR_SIZE_ARCH · 213fde71
      Jan Beulich authored
      The patch introducing this left out 64-bit x86 despite it also having
      extra entries.
      
      this solves Xen guest troubles.
      Signed-off-by: Jan Beulich <jbeulich@novell.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>