1. 03 Sep, 2011 4 commits
    • Theodore Ts'o's avatar
      ext4: improve handling of conflicting mount options · 56889787
      Theodore Ts'o authored
      If the user explicitly specifies conflicting mount options for
      delalloc or dioread_nolock and data=journal, fail the mount, instead
      of printing a warning and continuing (since many user's won't look at
      dmesg and notice the warning).
      
      Also, print a single warning that data=journal implies that delayed
      allocation is not on by default (since it's not supported), and
      furthermore that O_DIRECT is not supported.  Improve the text in
      Documentation/filesystems/ext4.txt so this is clear there as well.
      
      Similarly, if the dioread_nolock mount option is specified when the
      file system block size != PAGE_SIZE, fail the mount instead of
      printing a warning message and ignoring the mount option.
      Signed-off-by: default avatar"Theodore Ts'o" <tytso@mit.edu>
      56889787
    • Allison Henderson's avatar
      ext4: fix 2nd xfstests 127 punch hole failure · 2be4751b
      Allison Henderson authored
      This patch fixes a second punch hole bug found by xfstests 127.
      
      This bug happens because punch hole needs to flush the pages
      of the hole to avoid race conditions.  But if the end of the
      hole is in the same page as i_size, the buffer heads beyond
      i_size need to be unmapped and the page needs to be zeroed
      after it is flushed.
      
      To correct this, the new ext4_discard_partial_page_buffers
      routine is used to zero and unmap the partial page
      beyond i_size if the end of the hole appears in the same
      page as i_size.
      
      The code has also been optimized to set the end of the hole
      to the page after i_size if the specified hole exceeds i_size,
      and the code that flushes the pages has been simplified.
      Signed-off-by: default avatarAllison Henderson <achender@linux.vnet.ibm.com>
      2be4751b
    • Allison Henderson's avatar
      ext4: fix xfstests 75, 112, 127 punch hole failure · ba06208a
      Allison Henderson authored
      This patch addresses a bug found by xfstests 75, 112, 127
      when blocksize = 1k
      
      This bug happens because the punch hole code only zeros
      out non block aligned regions of the page.  This means that if the
      blocks are smaller than a page, then the block aligned regions of
      the page inside the hole are left un-zeroed, and their buffer heads
      are still mapped.  This bug is corrected by using
      ext4_discard_partial_page_buffers to properly zero the partial page
      at the head and tail of the hole, and unmap the corresponding buffer
      heads
      
      This patch also addresses a bug reported by Lukas while working on a
      new patch to add discard support for loop devices using punch hole.
      The bug happened because of the first and last block number
      needed to be cast to a larger data type before calculating the
      byte offset, but since now we only need the byte offsets of the
      pages, we no longer even need to be calculating the byte offsets
      of the blocks.  The code to do the block offset calculations is
      removed in this patch.
      Signed-off-by: default avatarAllison Henderson <achender@linux.vnet.ibm.com>
      ba06208a
    • Allison Henderson's avatar
      ext4: Add new ext4_discard_partial_page_buffers routines · 4e96b2db
      Allison Henderson authored
      This patch adds two new routines: ext4_discard_partial_page_buffers
      and ext4_discard_partial_page_buffers_no_lock.
      
      The ext4_discard_partial_page_buffers routine is a wrapper
      function to ext4_discard_partial_page_buffers_no_lock.
      The wrapper function locks the page and passes it to
      ext4_discard_partial_page_buffers_no_lock.
      Calling functions that already have the page locked can call
      ext4_discard_partial_page_buffers_no_lock directly.
      
      The ext4_discard_partial_page_buffers_no_lock function
      zeros a specified range in a page, and unmaps the
      corresponding buffer heads.  Only block aligned regions of the
      page will have their buffer heads unmapped.  Unblock aligned regions
      will be mapped if needed so that they can be updated with the
      partial zero out.  This function is meant to
      be used to update a page and its buffer heads to be zeroed
      and unmapped when the corresponding blocks have been released
      or will be released.
      
      This routine is used in the following scenarios:
      * A hole is punched and the non page aligned regions
        of the head and tail of the hole need to be discarded
      
      * The file is truncated and the partial page beyond EOF needs
        to be discarded
      
      * The end of a hole is in the same page as EOF.  After the
        page is flushed, the partial page beyond EOF needs to be
        discarded.
      
      * A write operation begins or ends inside a hole and the partial
        page appearing before or after the write needs to be discarded
      
      * A write operation extends EOF and the partial page beyond EOF
        needs to be discarded
      
      This function takes a flag EXT4_DISCARD_PARTIAL_PG_ZERO_UNMAPPED
      which is used when a write operation begins or ends in a hole.
      When the EXT4_DISCARD_PARTIAL_PG_ZERO_UNMAPPED flag is used, only
      buffer heads that are already unmapped will have the corresponding
      regions of the page zeroed.
      Signed-off-by: default avatarAllison Henderson <achender@linux.vnet.ibm.com>
      Signed-off-by: default avatar"Theodore Ts'o" <tytso@mit.edu>
      4e96b2db
  2. 31 Aug, 2011 6 commits
    • Theodore Ts'o's avatar
      ext4: call ext4_handle_dirty_metadata with correct inode in ext4_dx_add_entry · 5930ea64
      Theodore Ts'o authored
      ext4_dx_add_entry manipulates bh2 and frames[0].bh, which are two buffer_heads
      that point to directory blocks assigned to the directory inode.  However, the
      function calls ext4_handle_dirty_metadata with the inode of the file that's
      being added to the directory, not the directory inode itself.  Therefore,
      correct the code to dirty the directory buffers with the directory inode, not
      the file inode.
      Signed-off-by: default avatarDarrick J. Wong <djwong@us.ibm.com>
      Signed-off-by: default avatar"Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@kernel.org
      5930ea64
    • Darrick J. Wong's avatar
      ext4: ext4_mkdir should dirty dir_block with newly created directory inode · f9287c1f
      Darrick J. Wong authored
      ext4_mkdir calls ext4_handle_dirty_metadata with dir_block and the inode "dir".
      Unfortunately, dir_block belongs to the newly created directory (which is
      "inode"), not the parent directory (which is "dir").  Fix the incorrect
      association.
      Signed-off-by: default avatarDarrick J. Wong <djwong@us.ibm.com>
      Signed-off-by: default avatar"Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@kernel.org
      f9287c1f
    • Darrick J. Wong's avatar
      ext4: ext4_rename should dirty dir_bh with the correct directory · bcaa9929
      Darrick J. Wong authored
      When ext4_rename performs a directory rename (move), dir_bh is a
      buffer that is modified to update the '..' link in the directory being
      moved (old_inode).  However, ext4_handle_dirty_metadata is called with
      the old parent directory inode (old_dir) and dir_bh, which is
      incorrect because dir_bh does not belong to the parent inode.  Fix
      this error.
      Signed-off-by: default avatarDarrick J. Wong <djwong@us.ibm.com>
      Signed-off-by: default avatar"Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@kernel.org
      bcaa9929
    • Theodore Ts'o's avatar
      ext4: fake direct I/O mode for data=journal · 84ebd795
      Theodore Ts'o authored
      Currently attempts to open a file with O_DIRECT in data=journal mode
      causes the open to fail with -EINVAL.  This makes it very hard to test
      data=journal mode.  So we will let the open succeed, but then always
      fall back to O_DSYNC buffered writes.
      Signed-off-by: default avatar"Theodore Ts'o" <tytso@mit.edu>
      84ebd795
    • Theodore Ts'o's avatar
      ext2,ext3,ext4: don't inherit APPEND_FL or IMMUTABLE_FL for new inodes · 1cd9f097
      Theodore Ts'o authored
      This doesn't make much sense, and it exposes a bug in the kernel where
      attempts to create a new file in an append-only directory using
      O_CREAT will fail (but still leave a zero-length file).  This was
      discovered when xfstests #79 was generalized so it could run on all
      file systems.
      Signed-off-by: default avatar"Theodore Ts'o" <tytso@mit.edu>
      Cc:stable@kernel.org
      1cd9f097
    • Jiaying Zhang's avatar
      ext4: remove i_mutex lock in ext4_evict_inode to fix lockdep complaining · 8c0bec21
      Jiaying Zhang authored
      The i_mutex lock and flush_completed_IO() added by commit 2581fdc8
      in ext4_evict_inode() causes lockdep complaining about potential
      deadlock in several places.  In most/all of these LOCKDEP complaints
      it looks like it's a false positive, since many of the potential
      circular locking cases can't take place by the time the
      ext4_evict_inode() is called; but since at the very least it may mask
      real problems, we need to address this.
      
      This change removes the flush_completed_IO() and i_mutex lock in
      ext4_evict_inode().  Instead, we take a different approach to resolve
      the software lockup that commit 2581fdc8 intends to fix.  Rather
      than having ext4-dio-unwritten thread wait for grabing the i_mutex
      lock of an inode, we use mutex_trylock() instead, and simply requeue
      the work item if we fail to grab the inode's i_mutex lock.
      
      This should speed up work queue processing in general and also
      prevents the following deadlock scenario: During page fault,
      shrink_icache_memory is called that in turn evicts another inode B.
      Inode B has some pending io_end work so it calls ext4_ioend_wait()
      that waits for inode B's i_ioend_count to become zero.  However, inode
      B's ioend work was queued behind some of inode A's ioend work on the
      same cpu's ext4-dio-unwritten workqueue.  As the ext4-dio-unwritten
      thread on that cpu is processing inode A's ioend work, it tries to
      grab inode A's i_mutex lock.  Since the i_mutex lock of inode A is
      still hold before the page fault happened, we enter a deadlock.
      Signed-off-by: default avatarJiaying Zhang <jiayingz@google.com>
      Signed-off-by: default avatar"Theodore Ts'o" <tytso@mit.edu>
      8c0bec21
  3. 22 Aug, 2011 6 commits
  4. 21 Aug, 2011 4 commits
  5. 20 Aug, 2011 4 commits
  6. 19 Aug, 2011 15 commits
    • Jiaying Zhang's avatar
      ext4: flush any pending end_io requests before DIO reads w/dioread_nolock · dccaf33f
      Jiaying Zhang authored
      There is a race between ext4 buffer write and direct_IO read with
      dioread_nolock mount option enabled. The problem is that we clear
      PageWriteback flag during end_io time but will do
      uninitialized-to-initialized extent conversion later with dioread_nolock.
      If an O_direct read request comes in during this period, ext4 will return
      zero instead of the recently written data.
      
      This patch checks whether there are any pending uninitialized-to-initialized
      extent conversion requests before doing O_direct read to close the race.
      Note that this is just a bandaid fix. The fundamental issue is that we
      clear PageWriteback flag before we really complete an IO, which is
      problem-prone. To fix the fundamental issue, we may need to implement an
      extent tree cache that we can use to look up pending to-be-converted extents.
      Signed-off-by: default avatarJiaying Zhang <jiayingz@google.com>
      Signed-off-by: default avatar"Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@kernel.org
      dccaf33f
    • Jesse Barnes's avatar
      drm/i915: set GFX_MODE to pre-Ivybridge default value even on Ivybridge · b095cd0a
      Jesse Barnes authored
      Prior to Ivybridge, the GFX_MODE would default to 0x800, meaning that
      MI_FLUSH would flush the TLBs in addition to the rest of the caches
      indicated in the MI_FLUSH command.  However starting with Ivybridge, the
      register defaults to 0x2800 out of reset, meaning that to invalidate the
      TLB we need to use PIPE_CONTROL.  Since we're not doing that yet, go
      back to the old default so things work.
      
      v2: don't forget to actually *clear* the new bit
      Reviewed-by: default avatarEric Anholt <eric@anholt.net>
      Reviewed-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Tested-by: default avatarKenneth Graunke <kenneth@whitecape.org>
      Signed-off-by: default avatarJesse Barnes <jbarnes@virtuousgeek.org>
      b095cd0a
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.dk/linux-block · 5ccc3874
      Linus Torvalds authored
      * 'for-linus' of git://git.kernel.dk/linux-block: (23 commits)
        Revert "cfq: Remove special treatment for metadata rqs."
        block: fix flush machinery for stacking drivers with differring flush flags
        block: improve rq_affinity placement
        blktrace: add FLUSH/FUA support
        Move some REQ flags to the common bio/request area
        allow blk_flush_policy to return REQ_FSEQ_DATA independent of *FLUSH
        xen/blkback: Make description more obvious.
        cfq-iosched: Add documentation about idling
        block: Make rq_affinity = 1 work as expected
        block: swim3: fix unterminated of_device_id table
        block/genhd.c: remove useless cast in diskstats_show()
        drivers/cdrom/cdrom.c: relax check on dvd manufacturer value
        drivers/block/drbd/drbd_nl.c: use bitmap_parse instead of __bitmap_parse
        bsg-lib: add module.h include
        cfq-iosched: Reduce linked group count upon group destruction
        blk-throttle: correctly determine sync bio
        loop: fix deadlock when sysfs and LOOP_CLR_FD race against each other
        loop: add BLK_DEV_LOOP_MIN_COUNT=%i to allow distros 0 pre-allocated loop devices
        loop: add management interface for on-demand device allocation
        loop: replace linked list of allocated devices with an idr index
        ...
      5ccc3874
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 · 0c3bef61
      Linus Torvalds authored
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
        PCI: OF: Don't crash when bridge parent is NULL.
        PCI: export pcie_bus_configure_settings symbol
        PCI: code and comments cleanup
        PCI: make cardbus-bridge resources optional
        PCI: make SRIOV resources optional
        PCI : ability to relocate assigned pci-resources
        PCI: honor child buses add_size in hot plug configuration
        PCI: Set PCI-E Max Payload Size on fabric
      0c3bef61
    • David Daney's avatar
      PCI: OF: Don't crash when bridge parent is NULL. · 69566dd8
      David Daney authored
      In pcibios_get_phb_of_node(), we will crash while booting if
      bus->bridge->parent is NULL.
      
      Check for this case and avoid dereferencing the NULL pointer.
      Signed-off-by: default avatarDavid Daney <david.daney@cavium.com>
      Acked-by: default avatarBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Acked-by: default avatarGrant Likely <grant.likely@secretlab.ca>
      Signed-off-by: default avatarJesse Barnes <jbarnes@virtuousgeek.org>
      69566dd8
    • Jens Axboe's avatar
      Revert "cfq: Remove special treatment for metadata rqs." · b53d1ed7
      Jens Axboe authored
      We have a kernel build regression since 3.1-rc1, which is about 10%
      regression. The kernel source is in an ext3 filesystem.
      Alex Shi bisect it to commit:
      commit a07405b7
      Author: Justin TerAvest <teravest@google.com>
      Date:   Sun Jul 10 22:09:19 2011 +0200
      
          cfq: Remove special treatment for metadata rqs.
      
      Apparently this is caused by lack metadata preemption, where ext3/ext4
      do use READ_META. I didn't see a way to fix the issue, so suggest
      reverting the patch.
      
      This reverts commit a07405b7.
      
      Reported-by: Alex Shi<alex.shi@intel.com>
      Reported-by: Shaohua Li<shaohua.li@intel.com>
      Signed-off-by: default avatarJens Axboe <jaxboe@fusionio.com>
      b53d1ed7
    • Takashi Iwai's avatar
      ALSA: usb-audio - Fix missing mixer dB information · 38b65190
      Takashi Iwai authored
      The recent fix for testing dB range at the mixer creation time seems
      to cause regressions in some devices.  In such devices, reading the dB
      info at probing time gives an error, thus both dBmin and dBmax are still
      zero, and TLV flag isn't set although the later read of dB info succeeds.
      
      This patch adds a workaround for such a case by assuming that the later
      read will succeed.  In future, a similar test should be performed in a
      case where a wrong dB range is seen even in the later read.
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Cc: <stable@kernel.org>
      38b65190
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc · 01b88335
      Linus Torvalds authored
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
        sparc: fix array bounds error setting up PCIC NMI trap
      01b88335
    • Linus Torvalds's avatar
      Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev · 2c4ac99f
      Linus Torvalds authored
      * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
        drivers/ata/sata_dwc_460ex.c: add missing kfree
        ata: Add iMX pata support
        pata_via: disable ATAPI DMA on AVERATEC 3200
        [libata] sata_sil: fix used-uninit warning
      2c4ac99f
    • Linus Torvalds's avatar
      Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/linux-nfs · 35a21b42
      Linus Torvalds authored
      * 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
        NFSv4.1: Return NFS4ERR_BADSESSION to callbacks during session resets
        NFSv4.1: Fix the callback 'highest_used_slotid' behaviour
        pnfs-obj: Fix the comp_index != 0 case
        pnfs-obj: Bug when we are running out of bio
        nfs: add missing prefetch.h include
      35a21b42
    • Ian Campbell's avatar
      sparc: fix array bounds error setting up PCIC NMI trap · 4a0342ca
      Ian Campbell authored
        CC      arch/sparc/kernel/pcic.o
      arch/sparc/kernel/pcic.c: In function 'pcic_probe':
      arch/sparc/kernel/pcic.c:359:33: error: array subscript is above array bounds [-Werror=array-bounds]
      arch/sparc/kernel/pcic.c:359:8: error: array subscript is above array bounds [-Werror=array-bounds]
      arch/sparc/kernel/pcic.c:360:33: error: array subscript is above array bounds [-Werror=array-bounds]
      arch/sparc/kernel/pcic.c:360:8: error: array subscript is above array bounds [-Werror=array-bounds]
      arch/sparc/kernel/pcic.c:361:33: error: array subscript is above array bounds [-Werror=array-bounds]
      arch/sparc/kernel/pcic.c:361:8: error: array subscript is above array bounds [-Werror=array-bounds]
      cc1: all warnings being treated as errors
      
      I'm not particularly familiar with sparc but t_nmi (defined in head_32.S via
      the TRAP_ENTRY macro) and pcic_nmi_trap_patch (defined in entry.S) both appear
      to be 4 instructions long and I presume from the usage that instructions are
      int sized.
      Signed-off-by: default avatarIan Campbell <ian.campbell@citrix.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: sparclinux@vger.kernel.org
      Reviewed-by: default avatarSam Ravnborg <sam@ravnborg.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4a0342ca
    • Julia Lawall's avatar
      drivers/ata/sata_dwc_460ex.c: add missing kfree · a081da63
      Julia Lawall authored
      Currently, error handling code in this function calls the function
      sata_dwc_port_stop, but this function has essentially no effect if hsdevp
      has not been stored in ap, which is the case throughout this function.  The
      only effect is to print a debugging message including ap->print_id.
      
      The code is rewritten to not call sata_dwc_port_stop, but instead to jump
      to a local label that prints the original error message and the print_id
      information.  In the case where hsdevp has been already allocated (but not
      yet stored in ap), this value is freed as well.
      
      A simplified version of the semantic match that finds this problem is as
      follows: (http://coccinelle.lip6.fr/)
      
      // <smpl>
      @exists@
      local idexpression x;
      statement S,S1;
      expression E;
      identifier fl;
      expression *ptr != NULL;
      @@
      
      x = \(kmalloc\|kzalloc\|kcalloc\)(...);
      ...
      if (x == NULL) S
      <... when != x
           when != if (...) { <+...kfree(x)...+> }
           when any
           when != true x == NULL
      x->fl
      ...>
      (
      if (x == NULL) S1
      |
      if (...) { ... when != x
                     when forall
      (
       return \(0\|<+...x...+>\|ptr\);
      |
      * return ...;
      )
      }
      )
      // </smpl>
      Signed-off-by: default avatarJulia Lawall <julia@diku.dk>
      Signed-off-by: default avatarJeff Garzik <jgarzik@pobox.com>
      a081da63
    • Arnaud Patard (Rtp)'s avatar
      ata: Add iMX pata support · e39c75cf
      Arnaud Patard (Rtp) authored
      Add basic support for pata on iMX. It has been tested only on imx51.
      SDMA support will probably be added later so this version supports only
      PIO.
      
      v2:
        - enable only when needed IORDY
        - use dev_get_drvdata
      v3:
        - add missing clk_put() calls
        - use platform_get_irq()
        - fix resume code to avoid disabling IORDY on resume
      v4:
        - Remove EXPERIMENTAL and switch to depends on ARCH_MXC
        - Use devm_kzalloc()
        - make clock a must-have
        - Use only 1 ioremap
      Signed-off-by: default avatarArnaud Patard <arnaud.patard@rtp-net.org>
      Signed-off-by: default avatarJeff Garzik <jgarzik@redhat.com>
      e39c75cf
    • Tejun Heo's avatar
      pata_via: disable ATAPI DMA on AVERATEC 3200 · 6d0e194d
      Tejun Heo authored
      On AVERATEC 3200, pata_via causes memory corruption with ATAPI DMA,
      which often leads to random kernel oops.  The cause of the problem is
      not well understood yet and only small subset of machines using the
      controller seem affected.  Blacklist ATAPI DMA on the machine.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=11426Reported-and-tested-by: default avatarJim Bray <jimsantelmo@gmail.com>
      Cc: Alan Cox <alan@linux.intel.com>
      Cc: stable@kernel.org
      Signed-off-by: default avatarJeff Garzik <jgarzik@pobox.com>
      6d0e194d
    • Jeff Garzik's avatar
      [libata] sata_sil: fix used-uninit warning · ebd1699e
      Jeff Garzik authored
      Init 'serror' to silence the following warning:
      
      drivers/ata/sata_sil.c: In function ‘sil_interrupt’:
      drivers/ata/sata_sil.c:453:14: warning: ‘serror’ may be used uninitialized in
      this function [-Wuninitialized]
      
      This is not a 'can never happen' but is nonetheless extremely unlikely.
      The easiest and cleanest warning fix is simply to init the var,
      rather than worry about marking the var uninit-ok.
      Signed-off-by: default avatarJeff Garzik <jgarzik@redhat.com>
      ebd1699e
  7. 18 Aug, 2011 1 commit
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable · 5c80c71b
      Linus Torvalds authored
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
        Btrfs: set i_size properly when fallocating and we already
        btrfs: unlock on error in btrfs_file_llseek()
        btrfs: btrfs_permission's RO check shouldn't apply to device nodes
        Btrfs: truncate pages from clone ioctl target range
        Btrfs: fix uninitialized sync_pending
        Btrfs: fix wrong free space information
        btrfs: memory leak in btrfs_add_inode_defrag()
        Btrfs: use plain page_address() in header fields setget functions
        Btrfs: forced readonly when btrfs_drop_snapshot() fails
        Btrfs: check if there is enough space for balancing smarter
        Btrfs: fix a bug of balance on full multi-disk partitions
        Btrfs: fix an oops of log replay
        Btrfs: detect wether a device supports discard
        Btrfs: force unplugs when switching from high to regular priority bios
      5c80c71b