1. 30 Nov, 2006 40 commits
    • [GFS2] Mount problem with the GFS2 code · 0da3585e
      Srinivasa Ds authored
      While mounting the gfs2 filesystem, our test team hit a problem and we
      got this error message.
      =======================================================
      
      GFS2: fsid=: Trying to join cluster "lock_nolock", "dasde1"
      GFS2: fsid=dasde1.0: Joined cluster. Now mounting FS...
      GFS2: not a GFS2 filesystem
      GFS2: fsid=dasde1.0: can't read superblock: -22
      
      ==========================================================================
      On debugging further, we found that the problem occurs while reading
      the superblock (gfs2_read_super) and comparing the magic number in it.
      When I replaced the submit_bio() call (present in gfs2_read_super) with
      sb_getblk() and ll_rw_block(), the mount operation succeeded.
      On further analysis, we found that before calling submit_bio(),
      bio->bi_sector was set to the "sector" variable. This "sector" variable
      has the same value as bh->b_blocknr (the block number). Hence this value
      needs to be multiplied by (blocksize >> 9) (9 because the sector size
      is 2^9 bytes; the same thing happens in ll_rw_block before it calls
      submit_bio()).
      So I have developed a patch which solves this problem. Please let me
      know your comments.
      ================================================================
      Signed-off-by: Srinivasa DS <srinivasa@in.ibm.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
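The block-to-sector arithmetic described in this commit message can be sketched in a few lines of user-space C. This is an illustration only, not the patch itself: `block_to_sector` and `SECTOR_SHIFT` are names chosen for the example, and the block size is assumed to be a multiple of 512.

```c
#include <stdint.h>

/* A sector is 2^9 = 512 bytes, hence the shift by 9. */
#define SECTOR_SHIFT 9

/*
 * Convert a filesystem block number into a 512-byte sector number.
 * Hypothetical helper for illustration; blocksize is assumed to be a
 * multiple of 512.
 */
static uint64_t block_to_sector(uint64_t blocknr, uint32_t blocksize)
{
	return blocknr * (blocksize >> SECTOR_SHIFT);
}
```

With a 4096-byte block size each block spans eight sectors, so block 16 starts at sector 128. Passing the raw block number as the sector would read from the wrong place on any filesystem whose block size is larger than 512 bytes.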
    • [GFS2] Remove gfs2_check_acl() · 77386e1f
      Steven Whitehouse authored
      As pointed out by Adrian Bunk, the gfs2_check_acl() function is no
      longer used. This patch removes it and renames gfs2_check_acl_locked()
      to gfs2_check_acl() since we only need one variant of that function now.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Adrian Bunk <bunk@stusta.de>
    • [DLM] fix format warnings in rcom.c and recoverd.c · 57adf7ee
      Ryusuke Konishi authored
      This fixes the following gcc warnings generated on
      architectures where uint64_t != unsigned long long (e.g. ppc64).
      
      fs/dlm/rcom.c:154: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'uint64_t'
      fs/dlm/rcom.c:154: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'uint64_t'
      fs/dlm/recoverd.c:48: warning: format '%llx' expects type 'long long unsigned int', but argument 3 has type 'uint64_t'
      fs/dlm/recoverd.c:202: warning: format '%llx' expects type 'long long unsigned int', but argument 3 has type 'uint64_t'
      fs/dlm/recoverd.c:210: warning: format '%llx' expects type 'long long unsigned int', but argument 3 has type 'uint64_t'
      Signed-off-by: Ryusuke Konishi <ryusuke@osrg.net>
      Signed-off-by: Patrick Caulfield <pcaulfie@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
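The commit message lists the warnings but not the change that silences them. A common way to handle this class of warning is to cast the 64-bit value to `unsigned long long` so that it matches the `%llx` specifier on every architecture. The sketch below assumes that approach; `format_id` is a hypothetical user-space helper, not a function from fs/dlm.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Format a 64-bit id as hex into buf. uint64_t may be "unsigned long" on
 * some 64-bit architectures, so it is cast to match the %llx specifier.
 * Hypothetical helper for illustration.
 */
static int format_id(char *buf, size_t len, uint64_t id)
{
	return snprintf(buf, len, "%llx", (unsigned long long)id);
}
```

The cast is a no-op where the two types already coincide, so the same source compiles without warnings on both kinds of platform.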
    • [GFS2] lock function parameter · 0ac23069
      Randy Dunlap authored
      Fix function parameter typing:
      fs/gfs2/glock.c:100: warning: function declaration isn't a prototype
      Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • [DLM] don't accept replies to old recovery messages · 98f176fb
      David Teigland authored
      We often abort a recovery after sending a status request to a remote node.
      We want to ignore any potential status reply we get from the remote node.
      If we get one of these unwanted replies, we've often moved on to the next
      recovery message and incremented the message sequence counter, so the
      reply will be ignored due to the seq number.  In some cases, we've not
      moved on to the next message so the seq number of the reply we want to
      ignore is still correct, causing the reply to be accepted.  The next
      recovery message will then mistake this old reply as a new one.
      
      To fix this, we add the flag RCOM_WAIT to indicate when we can accept a
      new reply.  We clear this flag if we abort recovery while waiting for a
      reply.  Before the flag is set again (to allow new replies) we know that
      any old replies will be rejected due to their sequence number.  We also
      initialize the recovery-message sequence number to a random value when a
      lockspace is first created.  This makes it clear when messages are being
      rejected from an old instance of a lockspace that has since been
      recreated.
      Signed-off-by: David Teigland <teigland@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
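The interaction between the wait flag and the sequence number can be sketched as follows. This is a simplified user-space model under stated assumptions, not the DLM code: `struct recover_state`, `send_request`, `abort_recovery` and `accept_reply` are hypothetical names, and the boolean `waiting` stands in for the RCOM_WAIT flag.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified recovery state; not the kernel's data structure. */
struct recover_state {
	bool     waiting; /* stands in for the RCOM_WAIT flag */
	uint64_t seq;     /* sequence number of the outstanding request */
};

/* Send a new request: bump the sequence number, then allow a reply. */
static uint64_t send_request(struct recover_state *rs)
{
	rs->seq++;
	rs->waiting = true;
	return rs->seq;
}

/* Abort recovery: stop accepting replies, even ones with a matching seq. */
static void abort_recovery(struct recover_state *rs)
{
	rs->waiting = false;
}

/* A reply is accepted only while waiting and only for the current seq. */
static bool accept_reply(struct recover_state *rs, uint64_t reply_seq)
{
	if (!rs->waiting || reply_seq != rs->seq)
		return false;
	rs->waiting = false;
	return true;
}
```

In this model a reply to an aborted request is rejected by the cleared flag while its sequence number is still current, and by the sequence number once the next request has been sent.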
    • [DLM] fix size of STATUS_REPLY message · 1babdb45
      David Teigland authored
      When the not_ready routine sends a "fake" status reply with blank status
      flags, it needs to use the correct size for a normal STATUS_REPLY by
      including the size of the would-be config parameters.  We also fill in the
      nonexistent config parameters with an invalid lvblen value so it's easier
      to notice if these invalid parameters are ever being used.
      Signed-off-by: David Teigland <teigland@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • [GFS2] fs/gfs2/log.c:log_bmap() fix printk format warning · aed3255f
      Ryusuke Konishi authored
      Fix a printk format warning in fs/gfs2/log.c:
      fs/gfs2/log.c:322: warning: format '%llu' expects type 'long long unsigned int', but argument 3 has type 'sector_t'
      Signed-off-by: Ryusuke Konishi <ryusuke@osrg.net>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • [DLM] fix add_requestqueue checking nodes list · 2896ee37
      David Teigland authored
      Requests that arrive after recovery has started are saved in the
      requestqueue and processed after recovery is done.  Some of these requests
      are purged during recovery if they are from nodes that have been removed.
      We move the purging of the requests (dlm_purge_requestqueue) to later in
      the recovery sequence which allows the routine saving requests
      (dlm_add_requestqueue) to avoid filtering out requests by nodeid since the
      same will be done by the purge.  The current code has add_requestqueue
      filtering by nodeid but doesn't hold any locks when accessing the list of
      current nodes.  This also means that we need to call the purge routine
      when the lockspace is being shut down since the add routine will not be
      rejecting requests itself any more.
      Signed-off-by: David Teigland <teigland@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • [GFS2] Fix recursive locking in gfs2_getattr · dcf3dd85
      Steven Whitehouse authored
      The readdirplus NFS operation can result in gfs2_getattr being
      called with the glock already held. In this case we do not want
      to try and grab the lock again.
      
      This fixes Red Hat bugzilla #215727
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • [GFS2] Fix recursive locking in gfs2_permission · 300c7d75
      Steven Whitehouse authored
      Since gfs2_permission may be called either from the VFS (in which case
      we need to obtain a shared glock) or from GFS2 (in which case we already
      have a glock) we need to test to see whether or not a lock is required.
      The original test was buggy due to a potential race. This one should
      be safe.
      
      This fixes Red Hat bugzilla #217129
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • [GFS2] Reduce number of arguments to meta_io.c:getbuf() · cb4c0313
      Steven Whitehouse authored
      Since the superblock and the address_space are determined by the
      glock, we might as well just pass that as the argument since all
      the callers already have that available.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • [GFS2] Move gfs2_meta_syncfs() into log.c · a25311c8
      Steven Whitehouse authored
      By moving gfs2_meta_syncfs() into log.c, gfs2_ail1_start()
      can be made static.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • [GFS2] Fix journal flush problem · b004157a
      Steven Whitehouse authored
      This fixes a bug which resulted in poor performance due to flushing
      the journal too often. The code path in question was via the inode_go_sync()
      function in glops.c. The solution is not to flush the journal immediately
      when inodes are ejected from memory, but batch up the work for glockd to
      deal with later on. This means that glocks may now live on beyond the end of
      the lifetime of their inodes (but not very much longer in the normal case).
      
      Also fixed in this patch is a bug (which was hidden by the bug mentioned above) in
      calculation of the number of free journal blocks.
      
      The gfs2_logd process has been altered to be more responsive to the journal
      filling up. We now wake it up when the number of uncommitted journal blocks
      has reached the threshold level rather than trying to flush directly at the
      end of each transaction. This again means doing fewer, but larger, log
      flushes in general.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • [GFS2] mark_inode_dirty after write to stuffed file · ae619320
      Steven Whitehouse authored
      Writes to stuffed files were not being marked dirty correctly.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • [GFS2] Fix glock ordering on inode creation · 28626e20
      Steven Whitehouse authored
      The lock order here should be parent -> child rather than
      numeric order.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • [GFS2] Simplify glops functions · 1a14d3a6
      Steven Whitehouse authored
      The go_sync callback took two flags, but one of them was set on every
      call, so this patch removes one of the flags and makes the previously
      conditional operations (on this flag), unconditional.
      
      The go_inval callback took three flags, each of which was set on every
      call to it. This patch removes the flags and makes the operations
      unconditional, which makes the logic rather more obvious.
      
      Two now unused flags are also removed from incore.h.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • [GFS2] Fix Kconfig wrt CRC32 · fa2ecfc5
      Steven Whitehouse authored
      GFS2 requires the CRC32 library function. This was reported by
      Toralf Förster.
      
      Cc: Toralf Förster <toralf.foerster@gmx.de>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • [GFS2] Make sentinel dirents compatible with gfs1 · 5e7d65cd
      Steven Whitehouse authored
      When deleting directory entries, we set the inum.no_addr to zero
      in a dirent when it's the first dirent in a block and thus cannot
      be merged into the previous dirent as is the usual case. In gfs1,
      inum.no_formal_ino was used instead.
      
      This patch changes gfs2 to set both inum.no_addr and inum.no_formal_ino
      to zero. It also changes the test from just looking at inum.no_addr to
      look at both inum.no_addr and inum.no_formal_ino and a sentinel is
      now considered to be a dirent in which _either_ (or both) of them
      is set to zero.
      
      This resolves Red Hat bugzillas: #215809, #211465
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
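The sentinel rule described in this commit message reduces to a two-field test. The sketch below is a simplified model, not the on-disk structure: `struct inum` keeps only the two fields named in the message, and `dirent_is_sentinel` and `dirent_make_sentinel` are hypothetical helper names. Byte order is ignored here, which is harmless for a comparison against zero.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for the inode number pair stored in a dirent. */
struct inum {
	uint64_t no_formal_ino;
	uint64_t no_addr;
};

/* A dirent is a sentinel if either field (or both) is zero. */
static bool dirent_is_sentinel(const struct inum *in)
{
	return in->no_addr == 0 || in->no_formal_ino == 0;
}

/* When deleting the first dirent in a block, zero both fields. */
static void dirent_make_sentinel(struct inum *in)
{
	in->no_addr = 0;
	in->no_formal_ino = 0;
}
```

Testing either field means a sentinel written the gfs1 way (only no_formal_ino zeroed) and one written the older gfs2 way (only no_addr zeroed) are both recognised.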
    • [GFS2] Remove unused function from inode.c · dcd24799
      Steven Whitehouse authored
      The gfs2_glock_nq_m_atime function is unused insofar as it's only
      ever called with num_gh = 1, and this falls through to the
      gfs2_glock_nq_atime function, so we might as well call that directly.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • [GFS2] Remove unused sysfs files · 175011cf
      Steven Whitehouse authored
      Four of the sysfs files are unused and can therefore be removed.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • [GFS2] Tidy up bmap & fix boundary bug · 4cf1ed81
      Steven Whitehouse authored
      This moves the locking for bmap into the bmap function itself
      rather than using a wrapper function. It also fixes a bug where
      the boundary flag was set on the wrong bh. Also the flags on
      the mapped bh are reset earlier in the function to ensure that
      they are 100% correct on the error path.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • [GFS2] Fix memory allocation in glock.c · ab923031
      Steven Whitehouse authored
      Change from GFP_KERNEL to GFP_NOFS as this was causing a
      slow down when trying to push inodes from cache.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • [DLM] Fix DLM config · b98c95af
      Patrick Caulfield authored
      The attached patch fixes the DLM config so that it selects the chosen network
      transport. It should fix the bug where DLM can be left selected when NET gets
      unselected. This incorporates all the comments received about this patch.
      
      Cc: Adrian Bunk <bunk@stusta.de>
      Cc: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Patrick Caulfield <pcaulfie@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • [DLM] clear sbflags on lock master · 6f90a8b1
      David Teigland authored
      RH BZ 211622
      
      The ALTMODE flag can be set in the lock master's copy of the lock but
      never cleared, so ALTMODE will also be returned in a subsequent conversion
      of the lock when it shouldn't be.  This results in lock_dlm incorrectly
      switching to the alternate lock mode when returning the result to gfs
      which then asserts when it sees the wrong lock state.  The fix is to
      propagate the cleared sbflags value to the master node when the lock is
      requested.  QA's d_rwrandirectlarge test triggers this bug very quickly.
      Signed-off-by: David Teigland <teigland@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • [DLM] do full recover_locks barrier · 4b77f2c9
      David Teigland authored
      Red Hat BZ 211914
      
      The previous patch "[DLM] fix aborted recovery during
      node removal" was incomplete as discovered with further testing.  It set
      the bit for the RS_LOCKS barrier but did not then wait for the barrier.
      This is often ok, but sometimes it will cause yet another recovery hang.
      If it's a new node that also has the lowest nodeid that skips the barrier
      wait, then it misses the important step of collecting and reporting the
      barrier status from the other nodes (which is the job of the low nodeid in
      the barrier wait routine).
      Signed-off-by: David Teigland <teigland@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • [DLM] fix stopping unstarted recovery · 2cdc98aa
      David Teigland authored
      Red Hat BZ 211914
      
      When many nodes are joining a lockspace simultaneously, the dlm gets a
      quick sequence of stop/start events, a pair for adding each node.
      dlm_controld in user space sends dlm_recoverd in the kernel each stop and
      start event.  dlm_controld will sometimes send the stop before
      dlm_recoverd has had a chance to take up the previously queued start.  The
      stop aborts the processing of the previous start by setting the
      RECOVERY_STOP flag.  dlm_recoverd is erroneously clearing this flag and
      ignoring the stop/abort if it happens to take up the start after the stop
      meant to abort it.  The fix is to check the sequence number that's
      incremented for each stop/start before clearing the flag.
      Signed-off-by: David Teigland <teigland@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
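One way to picture the sequence-number check described in this commit message is the small model below. It is a sketch under assumptions, not the DLM implementation: `struct ls_state`, `ls_stop`, `ls_start` and `take_up_start` are hypothetical names, the boolean stands in for the RECOVERY_STOP flag, and the model assumes a single counter that is bumped on every stop and every start.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified lockspace state; not the kernel's data structure. */
struct ls_state {
	bool     recovery_stop; /* stands in for the RECOVERY_STOP flag */
	uint64_t seq;           /* bumped on every stop and every start */
};

/* A stop event aborts whatever start came before it. */
static void ls_stop(struct ls_state *ls)
{
	ls->seq++;
	ls->recovery_stop = true;
}

/* A start event is queued; its sequence number is handed to the worker. */
static uint64_t ls_start(struct ls_state *ls)
{
	return ++ls->seq;
}

/*
 * The worker takes up a queued start. The stop flag is cleared only when no
 * later stop/start has arrived; otherwise the start is stale and the flag
 * must stay set.
 */
static bool take_up_start(struct ls_state *ls, uint64_t start_seq)
{
	if (ls->seq != start_seq)
		return false;
	ls->recovery_stop = false;
	return true;
}
```

In this model a start that is picked up only after a later stop has arrived leaves the flag set, so the abort is not lost.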
    • [DLM] fix aborted recovery during node removal · 91c0dc93
      David Teigland authored
      Red Hat BZ 211914
      
      With the new cluster infrastructure, dlm recovery for a node removal can
      be aborted and restarted for a node addition.  When this happens, the
      restarted recovery isn't aware that it's doing recovery for the earlier
      removal as well as the addition.  So, it then skips the recovery steps
      only required when nodes are removed.  This can result in locks not being
      purged for failed/removed nodes.  The fix is to check for removed nodes
      for which recovery has not been completed at the start of a new recovery
      sequence.
      Signed-off-by: David Teigland <teigland@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • [DLM] fix requestqueue race · d4400156
      David Teigland authored
      Red Hat BZ 211914
      
      There's a race between dlm_recoverd (1) enabling locking and (2) clearing
      out the requestqueue, and dlm_recvd (1) checking if locking is enabled and
      (2) adding a message to the requestqueue.  An order of recoverd(1),
      recvd(1), recvd(2), recoverd(2) will result in a message being left on the
      requestqueue.  The fix is to have dlm_recvd check if dlm_recoverd has
      enabled locking after taking the mutex for the requestqueue and, if it
      has, process the message instead of queueing it.
      Signed-off-by: David Teigland <teigland@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • [DLM] status messages ping-pong between unmounted nodes · 435618b7
      David Teigland authored
      Red Hat BZ 213682
      
      If two nodes leave the lockspace (while unmounting the fs in the case of
      gfs) after one has sent a STATUS message to the other, STATUS/STATUS_REPLY
      messages will then ping-pong between the nodes when neither of them can
      find the lockspace in question any longer.  We kill this by not sending
      another STATUS message when we get a STATUS_REPLY for an unknown
      lockspace.
      Signed-off-by: David Teigland <teigland@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • [DLM] res_recover_locks_count not reset when recover_locks is aborted · 52069809
      David Teigland authored
      Red Hat BZ 213684
      
      If a node sends an lkb to the new master (RCOM_LOCK message) during
      recovery and recovery is then aborted on both nodes before it gets a
      reply, the res_recover_locks_count needs to be reset to 0 so that when the
      subsequent recovery comes along and sends the lkb to the new master again,
      the assertion that checks that the counter is zero doesn't trigger.
      Signed-off-by: David Teigland <teigland@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • [DLM] Add support for tcp communications · fdda387f
      Patrick Caulfield authored
      The following patch adds a TCP based communications layer
      to the DLM which is compile time selectable. The existing SCTP
      layer gives the advantage of allowing multihoming, whereas
      the TCP layer has been heavily tested in previous versions of
      the DLM and is known to be robust and therefore can be used as
      a baseline for performance testing.
      Signed-off-by: Patrick Caulfield <pcaulfie@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • [GFS2] Remove unused zero_readpage from stuffed_readpage · 61057c6b
      Russell Cattelan authored
      Stuffed files only consist of a maximum of
      (gfs2 block size - sizeof(struct gfs2_dinode)) bytes. Since the
      gfs2 block size is never larger than the page size, we will never see
      a call to stuffed_readpage for anything other than the first page
      in the file.
      Signed-off-by: Russell Cattelan <cattelan@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
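The size argument in this commit message can be checked with simple arithmetic. The sketch below is illustrative only: `stuffed_max_bytes` and `stuffed_fits_first_page` are hypothetical helpers, and the on-disk inode header size is passed in as a parameter rather than taken from the real struct gfs2_dinode.

```c
#include <stdbool.h>
#include <stddef.h>

/* Largest payload a stuffed file can hold: one block minus the inode header. */
static size_t stuffed_max_bytes(size_t block_size, size_t dinode_size)
{
	return block_size - dinode_size;
}

/* True if every byte of a stuffed file lies within the first page. */
static bool stuffed_fits_first_page(size_t block_size, size_t dinode_size,
				    size_t page_size)
{
	return stuffed_max_bytes(block_size, dinode_size) <= page_size;
}
```

For example, with a 4096-byte block, a 4096-byte page and a placeholder header size of 232 bytes (an assumed value for the example), the payload is at most 3864 bytes, which always falls inside page index 0.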
    • [GFS2] Fix race in logging code · 70209331
      Russell Cattelan authored
      The log lock is dropped prior to I/O submission, but
      this exposes a hole in which the log data structures
      may be going away due to a truncate.
      Store the buffer head in a local pointer prior to
      dropping the lock and rely on the buffer_head lock
      for consistency on the buffer head.
      Signed-off-by: Russell Cattelan <cattelan@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • [GFS2] Remove gfs2_inode_attr_in · 9e2dbdac
      Steven Whitehouse authored
      This function wasn't really doing the right thing. There was no need
      to update the inode size at this point and the updating of the
      i_blocks field has now been moved to the places where di_blocks is
      updated. A result of this patch and some of those preceding it is that
      unlocking a glock is now a much more efficient process, since there
      is no longer any requirement to copy data from the gfs2 inode into
      the vfs inode at this point.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • [GFS2] Inode number is constant · e7c698d7
      Steven Whitehouse authored
      Since the inode number is constant, we don't need to keep updating
      it every time we refresh the other inode fields.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • [GFS2] Only set inode flags when required · 6b124d8d
      Steven Whitehouse authored
      We were setting the inode flags from GFS2's flags far too often, even when they
      couldn't possibly have changed. This patch reduces the amount of flag
      setting going on so that we do it only when the inode is read in or
      when the flags have changed. The create case is covered by the "when
      the inode is read in" case.
      
      This also fixes a bug where we didn't set S_SYNC correctly.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • [GFS2] Fix page lock/glock deadlock · 2ca99501
      Steven Whitehouse authored
      This fixes a race between the glock and the page lock encountered
      during truncate in gfs2_readpage and gfs2_prepare_write. The gfs2_readpages
      function doesn't need the same fix since it only uses a try lock anyway, so
      it will fall back to gfs2_readpage in the case of a potential deadlock.
      
      This bug was spotted by Russell Cattelan.
      
      Cc: Russell Cattelan <cattelan@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • [GFS2] Remove unused GL_DUMP flag · c594d886
      Steven Whitehouse authored
      There is no way to set the GL_DUMP flag, and in any case the
      same thing can be done with systemtap if required for debugging,
      so this removes it.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • [GFS2] Don't copy meta_header for rgrp in and out · f6e58f01
      Steven Whitehouse authored
      The meta_header for an ondisk rgrp never changes, so there is no point
      copying it in and back out to disk. Also there is no reason to keep
      a copy for each rgrp in memory.
      
      The code already checks to ensure that the header is correct before
      it calls the routine to copy the data in, so we don't even need
      to check whether it's correct on disk in the functions in ondisk.c.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • [GFS2] Tidy up 0 initialisations in inode.c · 294caaa3
      Steven Whitehouse authored
      We don't need to use endian conversions for 0 initialisations
      when creating a new on-disk inode.
      
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
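The reason a zero needs no endian conversion is that reversing the bytes of zero still gives zero. The user-space sketch below shows this with a hand-written 32-bit byte swap; `byteswap32` is a name chosen for the example and stands in for what a conversion helper such as the kernel's cpu_to_be32() does on a little-endian machine.

```c
#include <stdint.h>

/* Reverse the byte order of a 32-bit value (illustrative helper). */
static uint32_t byteswap32(uint32_t x)
{
	return (x >> 24) |
	       ((x >> 8) & 0x0000ff00u) |
	       ((x << 8) & 0x00ff0000u) |
	       (x << 24);
}
```

Swapping 0x11223344 yields 0x44332211, while swapping 0 yields 0, so a plain zero assignment is already correct in either byte order.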