Commits · 7af8f1e4aa86720840d3318e4dc225c3c7e5a6d0 · Kirill Smelkov / linux

01 Mar, 2010 3 commits

ceph: include migrating caps in issued set · 7af8f1e4

Sage Weil authored 15 years ago


We should include caps that are mid-migration (we've received the EXPORT,
but not the IMPORT) in the issued caps set.
Signed-off-by: Sage Weil <sage@newdream.net>

7af8f1e4

ceph: return EBADF if waiting for caps on closed file · 195d3ce2

Sage Weil authored 15 years ago


Verify the file is actually open for the given caps when we are
waiting for caps.  This ensures we will wake up and return EBADF
if another thread closes the file out from under us.

Note that EBADF is also the correct return code from write(2)
when called on a file handle opened for reading (although the
vfs should catch that).
Signed-off-by: Sage Weil <sage@newdream.net>

195d3ce2

ceph: fix snaptrace decoding on cap migration between mds · 70edb55b

Sage Weil authored 15 years ago


This was simply broken.  Apparently at some point we thought about putting
the snaptrace in the middle section, but didn't.
Signed-off-by: Sage Weil <sage@newdream.net>

70edb55b

23 Feb, 2010 2 commits

ceph: drop messages on unregistered mds sessions; cleanup · 2600d2dd

Sage Weil authored 15 years ago


Verify the mds session is currently registered before handling
incoming messages.  Clean up message handlers to pull mds out
of session->s_mds instead of less trustworthy src field.

Clean up con_{get,put} debug output.
Signed-off-by: Sage Weil <sage@newdream.net>

2600d2dd

ceph: fix comments, locking in destroy_inode · a6369741

Sage Weil authored 15 years ago


The destroy_inode path needs no inode locks since there are no
inode references.  Update __ceph_remove_cap comment to reflect
that it is called without cap->session->s_mutex in this case.
Signed-off-by: Sage Weil <sage@newdream.net>

a6369741

19 Feb, 2010 2 commits

ceph: cleanup redundant code in handle_cap_grant · bcd2cbd1

Yehuda Sadeh authored 15 years ago


There is no state in local vars that requires us to loop after temporarily
dropping i_lock.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

bcd2cbd1

ceph: fix check for invalidate_mapping_pages success · 5ecad6fd

Sage Weil authored 15 years ago


We need to know whether there was any page left behind, and not the
return value (the total number of pages invalidated).  Look at the mapping
to see if we were successful or not.

Move it all into a helper to simplify the two callers.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

5ecad6fd
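
The point made in 5ecad6fd, that success means "no pages left in the mapping" rather than "some pages were invalidated", can be illustrated with a small userspace sketch. Everything here (`toy_page`, `toy_mapping`, `toy_invalidate`, `toy_try_invalidate`) is a hypothetical stand-in, not the kernel's page cache API; it only assumes that a best-effort invalidate skips locked pages and returns the number it dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a page and an address space mapping. */
struct toy_page { bool present; bool locked; };
struct toy_mapping { struct toy_page *pages; size_t n; size_t nrpages; };

/* Best-effort invalidate: drops unlocked pages, skips locked ones,
 * and returns the number of pages it dropped. */
static size_t toy_invalidate(struct toy_mapping *m)
{
    size_t dropped = 0;

    assert(m != NULL);
    for (size_t i = 0; i < m->n; i++) {
        if (m->pages[i].present && !m->pages[i].locked) {
            m->pages[i].present = false;
            m->nrpages--;
            dropped++;
        }
    }
    return dropped;
}

/* Helper shared by callers: ignore the count and look at the mapping
 * to decide whether anything was left behind. */
static bool toy_try_invalidate(struct toy_mapping *m)
{
    (void)toy_invalidate(m);
    return m->nrpages == 0;
}
```

A caller that tested the return value of `toy_invalidate()` for non-zero would report success even when a locked page survived; `toy_try_invalidate()` reports failure in that case so the caller can fall back to an asynchronous invalidate.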

17 Feb, 2010 2 commits

ceph: fix iterate_caps removal race · 7c1332b8

Sage Weil authored 15 years ago


We need to be able to iterate over all caps on a session with a
possibly slow callback on each cap.  To allow this, we used to
prevent cap reordering while we were iterating.  However, we were
not safe from races with removal: removing the 'next' cap would
make the next pointer from list_for_each_entry_safe be invalid,
and cause a lock up or similar badness.

Instead, we keep an iterator pointer in the session pointing to
the current cap.  As before, we avoid reordering.  For removal,
if the cap isn't the current cap we are iterating over, we are
fine.  If it is, we clear cap->ci (to mark the cap as pending
removal) but leave it in the session list.  In iterate_caps, we
can safely finish removal and get the next cap pointer.

While we're at it, clean up put_cap to not take a cap reservation
context, as it was never used.
Signed-off-by: Sage Weil <sage@newdream.net>

7c1332b8
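
The deferred-removal scheme described in 7c1332b8 can be sketched in single-threaded userspace C. This is only an illustration under stated assumptions: the names (`toy_scap`, `toy_session`, `session_iterate`, `session_remove`) are invented here, locking is omitted entirely, and the real kernel code differs in detail.

```c
#include <assert.h>
#include <stddef.h>

struct toy_scap {
    struct toy_scap *prev, *next;
    void *ci;                 /* owning inode; NULL marks "pending removal" */
};

struct toy_session {
    struct toy_scap head;     /* sentinel of a circular list */
    struct toy_scap *iter;    /* cap currently being visited, or NULL */
    int ncaps;
};

typedef void (*toy_cap_cb)(struct toy_session *, struct toy_scap *, void *);

static void session_init(struct toy_session *s)
{
    s->head.prev = s->head.next = &s->head;
    s->head.ci = NULL;
    s->iter = NULL;
    s->ncaps = 0;
}

static void session_add(struct toy_session *s, struct toy_scap *c, void *ci)
{
    assert(ci != NULL);
    c->ci = ci;
    c->prev = s->head.prev;
    c->next = &s->head;
    s->head.prev->next = c;
    s->head.prev = c;
    s->ncaps++;
}

static void unlink_cap(struct toy_session *s, struct toy_scap *c)
{
    c->prev->next = c->next;
    c->next->prev = c->prev;
    c->prev = c->next = c;
    s->ncaps--;
}

/* Removal: unlink at once unless the iterator is parked on this cap. */
static void session_remove(struct toy_session *s, struct toy_scap *c)
{
    c->ci = NULL;             /* mark as pending removal */
    if (s->iter != c)
        unlink_cap(s, c);
}

static int session_iterate(struct toy_session *s, toy_cap_cb cb, void *arg)
{
    int visited = 0;
    struct toy_scap *c = s->head.next;

    while (c != &s->head) {
        struct toy_scap *next;

        s->iter = c;
        cb(s, c, arg);        /* may remove any cap, including c or c->next */
        visited++;
        next = c->next;       /* read only after the callback returns */
        s->iter = NULL;
        if (c->ci == NULL)
            unlink_cap(s, c); /* finish the deferred removal */
        c = next;
    }
    return visited;
}

/* Example callback: removes the visited cap and its successor. */
static void drop_self_and_next(struct toy_session *s, struct toy_scap *c, void *arg)
{
    struct toy_scap *n = c->next;

    (void)arg;
    if (n != &s->head)
        session_remove(s, n);
    session_remove(s, c);
}
```

Because the next pointer is read only after the callback returns, removing the successor is harmless, and removing the current cap merely clears `ci` and leaves the unlink to the iterator.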

ceph: clean up readdir caps reservation · 85ccce43

Sage Weil authored 15 years ago


Use a global counter for the minimum number of allocated caps instead of
hard coding a check against readdir_max.  This takes into account multiple
client instances, and avoids examining the superblock mount options when a
cap is dropped.
Signed-off-by: Sage Weil <sage@newdream.net>

85ccce43

11 Feb, 2010 5 commits

ceph: remove bogus invalidate_mapping_pages · 80310491

Sage Weil authored 15 years ago


We were invalidating mapping pages when dropping FILE_CACHE in
__send_cap().  But ceph_check_caps attempts to invalidate already, and
also checks for success, so we should never get to this point.
Signed-off-by: Sage Weil <sage@newdream.net>

80310491

ceph: invalidate pages even if truncate is pending · 0840d8af

Sage Weil authored 15 years ago


There is no reason not to invalidate pages when a truncate is pending.
Both throw out page cache pages.
Signed-off-by: Sage Weil <sage@newdream.net>

0840d8af

ceph: cleanup async writeback, truncation, invalidate helpers · 3c6f6b79

Sage Weil authored 15 years ago


Grab inode ref in helper.  Make work functions static, with consistent
naming.
Signed-off-by: Sage Weil <sage@newdream.net>

3c6f6b79

ceph: do not retain caps that are being revoked · 68c28323

Sage Weil authored 15 years ago


Never retain caps in __send_cap() that are being revoked.
Signed-off-by: Sage Weil <sage@newdream.net>

68c28323

ceph: cap revocation fixes · cbd03635

Sage Weil authored 15 years ago


Try to invalidate pages in ceph_check_caps() if FILE_CACHE is being
revoked.  If we fail, queue an immediate async invalidate if FILE_CACHE
is being revoked.  (If it's not being revoked, we just queue the caps
for later evaluation, as per the old behavior.)
Signed-off-by: Sage Weil <sage@newdream.net>

cbd03635

23 Dec, 2009 2 commits

ceph: include transaction id in ceph_msg_header (protocol change) · 6df058c0

Sage Weil authored 15 years ago


Many (most?) message types include a transaction id.  By including it in
the fixed size header, we always have it available even when we are unable
to allocate memory for the (larger, variable sized) message body.  This
will allow us to error out the appropriate request instead of (silently)
dropping the reply.
Signed-off-by: Sage Weil <sage@newdream.net>

6df058c0
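
The benefit described in 6df058c0 can be sketched as follows: with the transaction id in the fixed-size header, the receiver can fail the matching request even when it cannot allocate the body. The struct layout and every name below (`toy_msg_header`, `toy_request`, `toy_handle_header`, `failing_alloc`) are assumptions made for illustration, not the actual `ceph_msg_header` or messenger code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical fixed-size header: the tid is available before the body. */
struct toy_msg_header {
    uint64_t tid;
    uint16_t type;
    uint32_t body_len;
};

/* Hypothetical pending request, matched to replies by tid. */
struct toy_request {
    uint64_t tid;
    int result;
    int done;
};

/* Allocator that always fails, to simulate memory pressure. */
static void *failing_alloc(size_t n)
{
    (void)n;
    return NULL;
}

static struct toy_request *find_request(struct toy_request *reqs, size_t n,
                                        uint64_t tid)
{
    for (size_t i = 0; i < n; i++)
        if (reqs[i].tid == tid)
            return &reqs[i];
    return NULL;
}

/* Handle a header: if the body cannot be allocated, error out the
 * request named by the header's tid instead of dropping the reply. */
static int toy_handle_header(const struct toy_msg_header *h,
                             void *(*alloc)(size_t),
                             struct toy_request *reqs, size_t nreq)
{
    struct toy_request *r;
    void *body;

    assert(h != NULL && alloc != NULL);
    r = find_request(reqs, nreq, h->tid);
    body = alloc(h->body_len ? h->body_len : 1);
    if (!body) {
        if (r) {
            r->result = -ENOMEM;
            r->done = 1;
        }
        return -ENOMEM;
    }
    /* A real receiver would read and decode the body here. */
    free(body);
    if (r) {
        r->result = 0;
        r->done = 1;
    }
    return 0;
}
```

Passing `failing_alloc` simulates the out-of-memory path: the request whose tid appears in the header completes with `-ENOMEM`, while other pending requests are untouched.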

ceph: do not touch_caps while iterating over caps list · 5dacf091

Sage Weil authored 15 years ago


To avoid confusing iterate_session_caps(), flag the session while we are
iterating so that __touch_cap does not rearrange items on the list.

All other modifiers of session->s_caps do so under the protection of
s_mutex.
Signed-off-by: Sage Weil <sage@newdream.net>

5dacf091

22 Dec, 2009 1 commit

ceph: hex dump corrupt server data to KERN_DEBUG · 9ec7cab1

Sage Weil authored 15 years ago


Also, print fsid using standard format, NOT hex dump.
Signed-off-by: Sage Weil <sage@newdream.net>

9ec7cab1

03 Dec, 2009 1 commit

ceph: whitespace cleanup · 50b885b9

Sage Weil authored 15 years ago


Signed-off-by: Sage Weil <sage@newdream.net>

50b885b9

12 Nov, 2009 1 commit

ceph: fix page invalidation deadlock · 11ea8eda

Sage Weil authored 15 years ago


We occasionally want to make a best-effort attempt to invalidate cache
pages without fear of blocking.  If this fails, we fall back to an async
invalidate in another thread.

Use invalidate_mapping_pages instead of invalidate_inode_pages2, as that
will skip locked pages, and not deadlock.
Signed-off-by: Sage Weil <sage@newdream.net>

11ea8eda

11 Nov, 2009 1 commit

ceph: remove recon_gen logic · cdac8303

Sage Weil authored 15 years ago


We don't get an explicit affirmative confirmation that our caps reconnected,
nor do we necessarily want to pay that cost.  So, take all this code out
for now.
Signed-off-by: Sage Weil <sage@newdream.net>

cdac8303

09 Nov, 2009 1 commit

ceph: do not confuse stale and dead (unreconnected) caps · 685f9a5d

Sage Weil authored 15 years ago


We were using the cap_gen to track both stale caps (caps that timed out
due to temporarily losing touch with the mds) and dead caps that did not
reconnect after an MDS failure.  Introduce a recon_gen counter to track
reconnections to restarted MDSs and kill dead caps based on that instead.

Rename gen to cap_gen while we're at it to make it more clear which is
which.
Signed-off-by: Sage Weil <sage@newdream.net>

685f9a5d

27 Oct, 2009 1 commit

ceph: allocate and parse mount args before client instance · 6b805185

Sage Weil authored 15 years ago


This simplifies much of the error handling during mount.  It also means
that we have the mount args before client creation, and we can initialize
based on those options.
Signed-off-by: Sage Weil <sage@newdream.net>

6b805185

16 Oct, 2009 2 commits

ceph: move dirty caps code around · 76e3b390

Sage Weil authored 15 years ago


Cleanup only.
Signed-off-by: Sage Weil <sage@newdream.net>

76e3b390

ceph: flush dirty caps via the cap_dirty list · afcdaea3

Sage Weil authored 15 years ago


Previously we were flushing dirty caps by passing an extra flag
when traversing the delayed caps list.  Besides being a bit ugly,
that can also miss caps that are dirty but didn't result in a
cap requeue: notably, mark_caps_dirty().

Separate the flushing into a separate helper, and traverse the
cap_dirty list.

This also brings i_dirty_item in line with i_dirty_caps: we are
on the list IFF caps != 0.  We carry an inode ref IFF
dirty_caps|flushing_caps != 0.

Lose the unused return value from __ceph_mark_caps_dirty().
Signed-off-by: Sage Weil <sage@newdream.net>

afcdaea3

14 Oct, 2009 1 commit

ceph: move generic flushing code into helper · cdc35f96

Sage Weil authored 15 years ago


Both callers of __mark_caps_flushing() do the same work; move it
into the helper.
Signed-off-by: Sage Weil <sage@newdream.net>

cdc35f96

06 Oct, 2009 1 commit

ceph: capability management · a8599bd8

Sage Weil authored 15 years ago


The Ceph metadata servers control client access to inode metadata and
file data by issuing capabilities, granting clients permission to read
and/or write both inode fields and file data to OSDs (storage nodes).
Each capability consists of a set of bits indicating which operations
are allowed.

If the client holds a *_SHARED cap, the client has a coherent value
that can be safely read from the cached inode.

In the case of *_EXCL (exclusive) or FILE_WR capabilities, the client
is allowed to change inode attributes (e.g., file size, mtime), note
its dirty state in the ceph_cap, and asynchronously flush that
metadata change to the MDS.

In the event of a conflicting operation (perhaps by another client),
the MDS will revoke the conflicting client capabilities.

In order for a client to cache an inode, it must hold a capability
with at least one MDS server.  When inodes are released, release
notifications are batched and periodically sent en masse to the MDS
cluster to release server state.
Signed-off-by: Sage Weil <sage@newdream.net>

a8599bd8
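
The capability model described in a8599bd8 can be sketched as a bitmask. This is a toy illustration only: the bit names echo the commit text, but the values, the `toy_cap` struct, and the helper functions are all made up here and are not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capability bits; the values are invented for this sketch. */
enum {
    CAP_FILE_SHARED = 1u << 0,  /* cached inode fields may be read */
    CAP_FILE_EXCL   = 1u << 1,  /* inode attributes may be changed locally */
    CAP_FILE_CACHE  = 1u << 2,  /* file data may be cached */
    CAP_FILE_WR     = 1u << 3   /* file data may be written */
};

struct toy_cap {
    uint32_t issued;  /* bits granted by the MDS */
    uint32_t dirty;   /* bits with local changes not yet flushed */
};

/* A *_SHARED cap means the cached value is coherent and safe to read. */
static bool cap_can_read_cached(const struct toy_cap *c)
{
    assert(c != NULL);
    return (c->issued & CAP_FILE_SHARED) != 0;
}

/* With *_EXCL or FILE_WR the client may change attributes locally and
 * note the dirty state for a later asynchronous flush. */
static bool cap_update_attr(struct toy_cap *c)
{
    uint32_t have = c->issued & (CAP_FILE_EXCL | CAP_FILE_WR);

    if (!have)
        return false;
    c->dirty |= have;
    return true;
}

/* Revocation by the MDS: returns the dirty bits that would need to be
 * flushed back before the revoked bits are given up. */
static uint32_t cap_revoke(struct toy_cap *c, uint32_t revoked)
{
    uint32_t to_flush = c->dirty & revoked;

    c->issued &= ~revoked;
    c->dirty &= ~revoked;
    return to_flush;
}
```

For example, a client holding `CAP_FILE_SHARED | CAP_FILE_EXCL` can read cached fields and update attributes; once the MDS revokes `CAP_FILE_EXCL`, further local attribute updates are refused in this sketch.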