1. 31 Jan, 2011 19 commits
  2. 29 Jan, 2011 5 commits
  3. 28 Jan, 2011 16 commits
    • bpm@sgi.com's avatar
      xfs: xfs_bmap_add_extent_delay_real should init br_startblock · 24446fc6
      bpm@sgi.com authored
      When filling in the middle of a previous delayed allocation in
      xfs_bmap_add_extent_delay_real, set br_startblock of the new delay
      extent to the right to nullstartblock instead of 0 before inserting
      the extent into the ifork (xfs_iext_insert), rather than setting
      br_startblock afterward.
      
      Adding the extent into the ifork with br_startblock=0 can lead to
      the extent being copied into the btree by xfs_bmap_extent_to_btree
      if we happen to convert from extents format to btree format before
      updating br_startblock with the correct value.  The unexpected
      addition of this delay extent to the btree can cause subsequent
      XFS_WANT_CORRUPTED_GOTO filesystem shutdown in several
      xfs_bmap_add_extent_delay_real cases where we are converting a delay
      extent to real and unexpectedly find an extent already inserted.
      For example:
      
      911         case BMAP_LEFT_FILLING:
      912                 /*
      913                  * Filling in the first part of a previous delayed allocation.
      914                  * The left neighbor is not contiguous.
      915                  */
      916                 trace_xfs_bmap_pre_update(ip, idx, state, _THIS_IP_);
      917                 xfs_bmbt_set_startoff(ep, new_endoff);
      918                 temp = PREV.br_blockcount - new->br_blockcount;
      919                 xfs_bmbt_set_blockcount(ep, temp);
      920                 xfs_iext_insert(ip, idx, 1, new, state);
      921                 ip->i_df.if_lastex = idx;
      922                 ip->i_d.di_nextents++;
      923                 if (cur == NULL)
      924                         rval = XFS_ILOG_CORE | XFS_ILOG_DEXT;
      925                 else {
      926                         rval = XFS_ILOG_CORE;
      927                         if ((error = xfs_bmbt_lookup_eq(cur, new->br_startoff,
      928                                         new->br_startblock, new->br_blockcount,
      929                                         &i)))
      930                                 goto done;
      931                         XFS_WANT_CORRUPTED_GOTO(i == 0, done);
      
      With the bogus extent in the btree we shutdown the filesystem at
      931.  The conversion from extents to btree format happens when the
      number of extents in the inode increases above ip->i_df.if_ext_max.
      xfs_bmap_extent_to_btree copies extents from the ifork into the
      btree, ignoring all delalloc extents which are denoted by
      br_startblock having some value of nullstartblock.
      
      SGI-PV: 1013221
      Signed-off-by: default avatarBen Myers <bpm@sgi.com>
      Reviewed-by: default avatarDave Chinner <dchinner@redhat.com>
      Signed-off-by: default avatarAlex Elder <aelder@sgi.com>
      24446fc6
    • Dave Chinner's avatar
      xfs: fix dquot shaker deadlock · 0fbca4d1
      Dave Chinner authored
      Commit 368e1361 ("xfs: remove duplicate code from dquot reclaim") fails
      to unlock the dquot freelist when the number of loop restarts is
      exceeded in xfs_qm_dqreclaim_one(). This causes hangs in memory
      reclaim.
      
      Rework the loop control logic into an unwind stack that all the
      different cases jump into. This means there is only one set of code
      that processes the loop exit criteria, and simplifies the unlocking
      of all the items from different points in the loop. It also fixes a
      double increment of the restart counter from the qi_dqlist_lock
      case.
      Reported-by: default avatarMalcolm Scott <lkml@malc.org.uk>
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarAlex Elder <aelder@sgi.com>
      0fbca4d1
    • Dave Chinner's avatar
      xfs: handle CIl transaction commit failures correctly · c6f990d1
      Dave Chinner authored
      Failure to commit a transaction into the CIL is not handled
      correctly. This currently can only happen when racing with a
      shutdown and requires an explicit shutdown check, so it rare and can
      be avoided. Remove the shutdown check and make the CIL commit a void
      function to indicate it will always succeed, thereby removing the
      incorrectly handled failure case.
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarAlex Elder <aelder@sgi.com>
      c6f990d1
    • Dave Chinner's avatar
      xfs: limit extsize to size of AGs and/or MAXEXTLEN · 5315837d
      Dave Chinner authored
      The extent size hint can be set to larger than an AG. This means
      that the alignment process can push the range to be allocated
      outside the bounds of the AG, resulting in assert failures or
      corrupted bmbt records. Similarly, if the extsize is larger than the
      maximum extent size supported, the alignment process will produce
      extents that are too large to fit into the bmbt records, resulting
      in a different type of assert/corruption failure.
      
      Fix this by limiting extsize at the time іt is set firstly to be
      less than MAXEXTLEN, then to be a maximum of half the size of the
      AGs in the filesystem for non-realtime inodes. Realtime inodes do
      not allocate out of AGs, so don't have to be restricted by the size
      of AGs.
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarAlex Elder <aelder@sgi.com>
      5315837d
    • Dave Chinner's avatar
      xfs: prevent extsize alignment from exceeding maximum extent size · 4ce15989
      Dave Chinner authored
      When doing delayed allocation, if the allocation size is for a
      maximally sized extent, extent size alignment can push it over this
      limit. This results in an assert failure in xfs_bmbt_set_allf() as
      the extent length is too large to find in the extent record.
      
      Fix this by ensuring that we allow for space that extent size
      alignment requires (up to 2 * (extsize -1) blocks as we have to
      handle both head and tail alignment) when limiting the maximum size
      of the extent.
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarAlex Elder <aelder@sgi.com>
      4ce15989
    • Dave Chinner's avatar
      xfs: limit extent length for allocation to AG size · 14b064ce
      Dave Chinner authored
      Delayed allocation extents can be larger than AGs, so when trying to
      convert a large range we may scan every AG inside
      xfs_bmap_alloc_nullfb() trying to find an AG with a size larger than
      an AG. We should stop when we find the first AG with a maximum
      possible allocation size. This causes excessive CPU usage when there
      are lots of AGs.
      
      The same problem occurs when doing preallocation of a range larger
      than an AG.
      
      Fix the problem by limiting real allocation lengths to the maximum
      that an AG can support. This means if we have empty AGs, we'll stop
      the search at the first of them. If there are no empty AGs, we'll
      still scan them all, but that is a different problem....
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarAlex Elder <aelder@sgi.com>
      14b064ce
    • Dave Chinner's avatar
      xfs: speculative delayed allocation uses rounddown_power_of_2 badly · b8fc8263
      Dave Chinner authored
      rounddown_power_of_2() returns an undefined result when passed a
      value of zero. The specualtive delayed allocation code is doing this
      when the inode is zero length. Hence occasionally the preallocation
      is much, much larger than is necessary (e.g. 8GB for a 270 _byte_
      file). Ensure we don't even pass a zero value to this function so
      the result of preallocation is always the desired size.
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarAlex Elder <aelder@sgi.com>
      b8fc8263
    • Dave Chinner's avatar
      xfs: fix efi item leak on forced shutdown · e34a314c
      Dave Chinner authored
      After test 139, kmemleak shows:
      
      unreferenced object 0xffff880078b405d8 (size 400):
        comm "xfs_io", pid 4904, jiffies 4294909383 (age 1186.728s)
        hex dump (first 32 bytes):
          60 c1 17 79 00 88 ff ff 60 c1 17 79 00 88 ff ff  `..y....`..y....
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        backtrace:
          [<ffffffff81afb04d>] kmemleak_alloc+0x2d/0x60
          [<ffffffff8115c6cf>] kmem_cache_alloc+0x13f/0x2b0
          [<ffffffff814aaa97>] kmem_zone_alloc+0x77/0xf0
          [<ffffffff814aab2e>] kmem_zone_zalloc+0x1e/0x50
          [<ffffffff8147cd6b>] xfs_efi_init+0x4b/0xb0
          [<ffffffff814a4ee8>] xfs_trans_get_efi+0x58/0x90
          [<ffffffff81455fab>] xfs_bmap_finish+0x8b/0x1d0
          [<ffffffff814851b4>] xfs_itruncate_finish+0x2c4/0x5d0
          [<ffffffff814a970f>] xfs_setattr+0x8df/0xa70
          [<ffffffff814b5c7b>] xfs_vn_setattr+0x1b/0x20
          [<ffffffff8117dc00>] notify_change+0x170/0x2e0
          [<ffffffff81163bf6>] do_truncate+0x66/0xa0
          [<ffffffff81163d0b>] sys_ftruncate+0xdb/0xe0
          [<ffffffff8103a002>] system_call_fastpath+0x16/0x1b
          [<ffffffffffffffff>] 0xffffffffffffffff
      
      The cause of the leak is that the "remove" parameter of IOP_UNPIN()
      is never set when a CIL push is aborted. This means that the EFI
      item is never freed if it was in the push being cancelled. The
      problem is specific to delayed logging, but has uncovered a couple
      of problems with the handling of IOP_UNPIN(remove).
      
      Firstly, we cannot safely call xfs_trans_del_item() from IOP_UNPIN()
      in the CIL commit failure path or the iclog write failure path
      because for delayed loging we have no transaction context. Hence we
      must only call xfs_trans_del_item() if the log item being unpinned
      has an active log item descriptor.
      
      Secondly, xfs_trans_uncommit() does not handle log item descriptor
      freeing during the traversal of log items on a transaction. It can
      reference a freed log item descriptor when unpinning an EFI item.
      Hence it needs to use a safe list traversal method to allow items to
      be removed from the transaction during IOP_UNPIN().
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarAlex Elder <aelder@sgi.com>
      e34a314c
    • Jeff Garzik's avatar
    • Tejun Heo's avatar
      libata: set queue DMA alignment to sector size for ATAPI too · 729a6a30
      Tejun Heo authored
      ata_pio_sectors() expects buffer for each sector to be contained in a
      single page; otherwise, it ends up overrunning the first page.  This
      is achieved by setting queue DMA alignment.  If sector_size is smaller
      than PAGE_SIZE and all buffers are sector_size aligned, buffer for
      each sector is always contained in a single page.
      
      This wasn't applied to ATAPI devices but IDENTIFY_PACKET is executed
      as ATA_PROT_PIO and thus uses ata_pio_sectors().  Newer versions of
      udev issue IDENTIFY_PACKET with unaligned buffer triggering the
      problem and causing oops.
      
      This patch fixes the problem by setting sdev->sector_size to
      ATA_SECT_SIZE on ATATPI devices and always setting DMA alignment to
      sector_size.  While at it, add a warning for the unlikely but still
      possible scenario where sector_size is larger than PAGE_SIZE, in which
      case the alignment wouldn't be enough.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Reported-by: default avatarJohn Stanley <jpsinthemix@verizon.net>
      Tested-by: default avatarJohn Stanley <jpsinthemix@verizon.net>
      Cc: stable@kernel.org
      Signed-off-by: default avatarJeff Garzik <jgarzik@redhat.com>
      729a6a30
    • Francesco Antonacci's avatar
      libata: DVR-212D can't do SETXFER DVD-RW DVR-212D · 4a5610a0
      Francesco Antonacci authored
      PIONEER DVR-212D can't do SETXFER like its sibling DVRTD08.  Add
      ATA_HORKAGE_NOSETXFER for it.  Reported in bko#27502.
      
        https://bugzilla.kernel.org/show_bug.cgi?id=27502Signed-off-by: default avatarFrancesco Antonacci <fraanto@gmail.com>
      Acked-by: default avatarTejun Heo <tj@kernel.org>
      Signed-off-by: default avatarJeff Garzik <jgarzik@redhat.com>
      4a5610a0
    • Anssi Hannula's avatar
      ahci: add HFLAG_YES_FBS and apply it to 88SE9128 · 10aca06c
      Anssi Hannula authored
      Commit 5f173107 added HFLAG_YES_FBS workaround for 88SE9128
      (1b4b:9123).
      
      However, that change inadvertently caused the legacy IDE interface of
      the controller (with the same pci id) to become associated with the AHCI
      driver as well, causing the driver to try to bring the interface up in
      vain.
      
      Fix that by matching against class as well.
      Signed-off-by: default avatarAnssi Hannula <anssi.hannula@iki.fi>
      Signed-off-by: default avatarJeff Garzik <jgarzik@redhat.com>
      10aca06c
    • Sergei Shtylyov's avatar
      pata_hpt37x: inherit prereset() method for HPT374 · defed559
      Sergei Shtylyov authored
      Commit ab81a505 (pata_hpt37x: unify ->pre_reset
      methods) neglected to remove the initializer for the prereset() method from
      'hpt374_fn1_port_ops' (it's inherited from 'hpt372_port_ops' anyway), as well
      as to update the comment in hpt37x_init_one()...
      Signed-off-by: default avatarSergei Shtylyov <sshtylyov@ru.mvista.com>
      Signed-off-by: default avatarJeff Garzik <jgarzik@redhat.com>
      defed559
    • Seth Heasley's avatar
      ahci: AHCI mode SATA patch for Intel DH89xxCC DeviceIDs · a4a461a6
      Seth Heasley authored
      This patch adds the AHCI-mode SATA DeviceID for the Intel DH89xxCC PCH.
      Signed-off-by: default avatarSeth Heasley <seth.heasley@intel.com>
      Signed-off-by: default avatarJeff Garzik <jgarzik@redhat.com>
      a4a461a6
    • Sergei Shtylyov's avatar
      pata_hpt37x: fold 'if' statement into 'switch' · 910f7bb1
      Sergei Shtylyov authored
      hpt37x_init_one() has a large *if* statement which should really be folded into
      the *switch* statement that currently constitutes its *else* branch, reducing
      one level of indentation...
      Signed-off-by: default avatarSergei Shtylyov <sshtylyov@ru.mvista.com>
      Signed-off-by: default avatarJeff Garzik <jgarzik@redhat.com>
      910f7bb1
    • Sergei Shtylyov's avatar
      pata_hpt{37x|3x2n}: use pr_*(DRV_NAME ...) instead of printk(KERN_* ...) · 40d69ba0
      Sergei Shtylyov authored
      ... the same as the 'pata_hpt366' driver does.
      Signed-off-by: default avatarSergei Shtylyov <sshtylyov@ru.mvista.com>
      Signed-off-by: default avatarJeff Garzik <jgarzik@redhat.com>
      40d69ba0