1. 14 Mar, 2011 18 commits
  2. 13 Mar, 2011 14 commits
  3. 12 Mar, 2011 1 commit
    • Chris Mason's avatar
      Btrfs: break out of shrink_delalloc earlier · 36e39c40
      Chris Mason authored
      Josef had changed shrink_delalloc to exit after three shrink
      attempts, which wasn't quite enough because new writers could
      race in and steal free space.
      
      But it also fixed deadlocks and stalls as we tried to recover
      delalloc reservations.  The code was tweaked to loop 1024
      times, and would reset the counter any time a small amount
      of progress was made.  This was too drastic, and with a
      lot of writers we can end up stuck in shrink_delalloc forever.
      
      The shrink_delalloc loop is fairly complex because the caller is looping
      too, and the caller will go ahead and force a transaction commit to make
      sure we reclaim space.
      
      This reworks things to exit shrink_delalloc when we've forced some
      writeback and the delalloc reservations have gone down.  This means
      the writeback has not just started but has also finished at
      least some of the metadata changes required to reclaim delalloc
      space.
      
      If we've got this wrong, we're returning ENOSPC too early, which
      is a big improvement over the current behavior of hanging the machine.
      
      Test 224 in xfstests hammers on this nicely, and with 1000 writers
      trying to fill a 1GB drive we get our first ENOSPC at 93% full.  The
      other writers are able to continue until we get 100%.
      
      This is a worst case test for btrfs because the 1000 writers are doing
      small IO, and the small FS size means we don't have a lot of room
      for metadata chunks.
      Signed-off-by: default avatarChris Mason <chris.mason@oracle.com>
      36e39c40
  4. 11 Mar, 2011 7 commits
    • Chuck Lever's avatar
      NFS: NFSROOT should default to "proto=udp" · 53d47375
      Chuck Lever authored
      There have been a number of recent reports that NFSROOT is no longer
      working with default mount options, but fails only with certain NICs.
      
      Brian Downing <bdowning@lavos.net> bisected to commit 56463e50 "NFS:
      Use super.c for NFSROOT mount option parsing".  Among other things,
      this commit changes the default mount options for NFSROOT to use TCP
      instead of UDP as the underlying transport.
      
      TCP seems less able to deal with NICs that are slow to initialize.
      The system logs that have accompanied reports of problems all show
      that NFSROOT attempts to establish a TCP connection before the NIC is
      fully initialized, and thus the TCP connection attempt fails.
      
      When a TCP connection attempt fails during a mount operation, the
      NFS stack needs to fail the operation.  Usually user space knows how
      and when to retry it.  The network layer does not report a distinct
      error code for this particular failure mode.  Thus, there isn't a
      clean way for the RPC client to see that it needs to retry in this
      case, but not in others.
      
      Because NFSROOT is used in some environments where it is not possible
      to update the kernel command line to specify "udp", the proper thing
      to do is change NFSROOT to use UDP by default, as it did before commit
      56463e50.
      
      To make it easier to see how to change default mount options for
      NFSROOT and to distinguish default settings from mandatory settings,
      I've adjusted a couple of areas to document the specifics.
      
      root_nfs_cat() is also modified to deal with commas properly when
      concatenating strings containing mount option lists.  This keeps
      root_nfs_cat() call sites simpler, now that we may be concatenating
      multiple mount option strings.
      Tested-by: default avatarBrian Downing <bdowning@lavos.net>
      Tested-by: default avatarMark Brown <broonie@opensource.wolfsonmicro.com>
      Signed-off-by: default avatarChuck Lever <chuck.lever@oracle.com>
      Cc: <stable@kernel.org> # 2.6.37
      Signed-off-by: default avatarTrond Myklebust <Trond.Myklebust@netapp.com>
      53d47375
    • Huang Weiyi's avatar
      nfs4: remove duplicated #include · 57df216b
      Huang Weiyi authored
      Remove duplicated #include('s) in
        fs/nfs/nfs4proc.c
      Signed-off-by: default avatarHuang Weiyi <weiyi.huang@gmail.com>
      Signed-off-by: default avatarTrond Myklebust <Trond.Myklebust@netapp.com>
      57df216b
    • Trond Myklebust's avatar
      NFSv4: nfs4_state_mark_reclaim_nograce() should be static · f9feab1e
      Trond Myklebust authored
      There are no more external users of nfs4_state_mark_reclaim_nograce() or
      nfs4_state_mark_reclaim_reboot(), so mark them as static.
      Signed-off-by: default avatarTrond Myklebust <Trond.Myklebust@netapp.com>
      f9feab1e
    • Trond Myklebust's avatar
      ecac799a
    • Trond Myklebust's avatar
      NFSv4.1: Fix the handling of the SEQUENCE status bits · b4410c2f
      Trond Myklebust authored
      We want SEQUENCE status bits to be handled by the state manager in order
      to avoid threading issues.
      Signed-off-by: default avatarTrond Myklebust <Trond.Myklebust@netapp.com>
      b4410c2f
    • Trond Myklebust's avatar
      NFSv4/4.1: Fix nfs4_schedule_state_recovery abuses · 0400a6b0
      Trond Myklebust authored
      nfs4_schedule_state_recovery() should only be used when we need to force
      the state manager to check the lease. If we just want to start the
      state manager in order to handle a state recovery situation, we should be
      using nfs4_schedule_state_manager().
      
      This patch fixes the abuses of nfs4_schedule_state_recovery() by replacing
      its use with a set of helper functions that do the right thing.
      Signed-off-by: default avatarTrond Myklebust <Trond.Myklebust@netapp.com>
      0400a6b0
    • Lukas Czerner's avatar
      block: fix mis-synchronisation in blkdev_issue_zeroout() · 0aeea189
      Lukas Czerner authored
      BZ29402
      https://bugzilla.kernel.org/show_bug.cgi?id=29402
      
      We can hit serious mis-synchronization in bio completion path of
      blkdev_issue_zeroout() leading to a panic.
      
      The problem is that when we are going to wait_for_completion() in
      blkdev_issue_zeroout() we check if the bb.done equals issued (number of
      submitted bios). If it does, we can skip the wait_for_completition()
      and just out of the function since there is nothing to wait for.
      However, there is a ordering problem because bio_batch_end_io() is
      calling atomic_inc(&bb->done) before complete(), hence it might seem to
      blkdev_issue_zeroout() that all bios has been completed and exit. At
      this point when bio_batch_end_io() is going to call complete(bb->wait),
      bb and wait does not longer exist since it was allocated on stack in
      blkdev_issue_zeroout() ==> panic!
      
      (thread 1)                      (thread 2)
      bio_batch_end_io()              blkdev_issue_zeroout()
        if(bb) {                      ...
          if (bb->end_io)             ...
            bb->end_io(bio, err);     ...
          atomic_inc(&bb->done);      ...
          ...                         while (issued != atomic_read(&bb.done))
          ...                         (let issued == bb.done)
          ...                         (do the rest of the function)
          ...                         return ret;
          complete(bb->wait);
          ^^^^^^^^
          panic
      
      We can fix this easily by simplifying bio_batch and completion counting.
      
      Also remove bio_end_io_t *end_io since it is not used.
      Signed-off-by: default avatarLukas Czerner <lczerner@redhat.com>
      Reported-by: default avatarEric Whitney <eric.whitney@hp.com>
      Tested-by: default avatarEric Whitney <eric.whitney@hp.com>
      Reviewed-by: default avatarJeff Moyer <jmoyer@redhat.com>
      CC: Dmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: default avatarJens Axboe <jaxboe@fusionio.com>
      0aeea189