Commit ea9ca07a authored by Matthew Sakai, committed by Mike Snitzer

dm vdo: add documentation details on zones and locking

Add details describing the vdo zone and thread model to the
documentation comments for major vdo components. Also add a
high-level description of the block map structure.
Signed-off-by: Matthew Sakai <msakai@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
parent 512039b4
@@ -19,6 +19,21 @@
 #include "vio.h"
 #include "wait-queue.h"
+/*
+ * The block map is responsible for tracking all the logical to physical mappings of a VDO. It
+ * consists of a collection of 60 radix trees gradually allocated as logical addresses are used.
+ * Each tree is assigned to a logical zone such that it is easy to compute which zone must handle
+ * each logical address. Each logical zone also has a dedicated portion of the leaf page cache.
+ *
+ * Each logical zone has a single dedicated queue and thread for performing all updates to the
+ * radix trees assigned to that zone. The concurrency guarantees of this single-threaded model
+ * allow the code to omit more fine-grained locking for the block map structures.
+ *
+ * Load operations must be performed on the admin thread. Normal operations, such as reading and
+ * updating mappings, must be performed on the appropriate logical zone thread. Save operations
+ * must be launched from the same admin thread as the original load operation.
+ */
 enum {
 	BLOCK_MAP_VIO_POOL_SIZE = 64,
 };
......
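The block map comment above says that each of the 60 radix trees is assigned to a logical zone so that the owning zone is easy to compute from a logical address. A minimal sketch of how such a computation could look follows. The names (`ROOT_COUNT`, `ENTRIES_PER_PAGE`, `tree_for_lbn`, `zone_for_lbn`), the figure of 812 entries per leaf page, and the round-robin assignment of trees to zones are illustrative assumptions and are not taken from this commit.

```c
#include <assert.h>
#include <stdint.h>

/* From the comment in this commit: the block map is a collection of 60 radix trees. */
#define ROOT_COUNT 60u

/* Assumption for illustration: number of mappings held by one leaf page. */
#define ENTRIES_PER_PAGE 812u

/* Hypothetical helper: pick the radix tree that covers a logical block number. */
static inline uint32_t tree_for_lbn(uint64_t lbn)
{
	uint64_t page_number = lbn / ENTRIES_PER_PAGE;

	return (uint32_t)(page_number % ROOT_COUNT);
}

/* Hypothetical helper: pick the logical zone that owns that tree (round-robin assumed). */
static inline uint32_t zone_for_lbn(uint64_t lbn, uint32_t logical_zone_count)
{
	assert(logical_zone_count > 0);
	return tree_for_lbn(lbn) % logical_zone_count;
}
```

Because the result depends only on the address and the zone count, any thread can work out where to send a request without consulting shared state, which fits the single-threaded zone model the comment describes.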
@@ -14,6 +14,11 @@
  * deduplicate against a single block instead of being serialized through a PBN read lock. Only one
  * index query is needed for each hash_lock, instead of one for every data_vio.
  *
+ * Hash_locks are assigned to hash_zones by computing a modulus on the hash itself. Each hash_zone
+ * has a single dedicated queue and thread for performing all operations on the hash_locks assigned
+ * to that zone. The concurrency guarantees of this single-threaded model allow the code to omit
+ * more fine-grained locking for the hash_lock structures.
+ *
  * A hash_lock acts like a state machine perhaps more than as a lock. Other than the starting and
  * ending states INITIALIZING and BYPASSING, every state represents and is held for the duration of
  * an asynchronous operation. All state transitions are performed on the thread of the hash_zone
......
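The hash_lock comment above states that hash_locks are assigned to hash_zones by computing a modulus on the hash. A rough sketch of that idea follows. The `chunk_name` type, its 16-byte size, the choice of the first byte as the input, and the function name `hash_zone_for_name` are all assumptions made for illustration; the commit does not show the actual selection code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a block's hash; the 16-byte size is an assumption. */
struct chunk_name {
	uint8_t bytes[16];
};

/* Hypothetical helper: map a hash to one of hash_zone_count zones by modulus. */
static inline uint32_t hash_zone_for_name(const struct chunk_name *name,
					  uint32_t hash_zone_count)
{
	assert(name != NULL);
	assert(hash_zone_count > 0);
	/* Assumption: one byte of the hash is uniform enough to spread locks evenly. */
	return name->bytes[0] % hash_zone_count;
}
```

Identical hashes always land in the same zone, so every data_vio that might share a hash_lock is handled by one thread.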
@@ -26,6 +26,10 @@
  * write amplification of writes by providing amortization of slab journal and block map page
  * updates.
  *
+ * The recovery journal has a single dedicated queue and thread for performing all journal updates.
+ * The concurrency guarantees of this single-threaded model allow the code to omit more
+ * fine-grained locking for recovery journal structures.
+ *
  * The journal consists of a set of on-disk blocks arranged as a circular log with monotonically
  * increasing sequence numbers. Three sequence numbers serve to define the active extent of the
  * journal. The 'head' is the oldest active block in the journal. The 'tail' is the end of the
......
@@ -29,11 +29,17 @@
  * a single array of slabs in order to eliminate the need for additional math in order to compute
  * which physical zone a PBN is in. It also has a block_allocator per zone.
  *
- * Load operations are required to be performed on a single thread. Normal operations are assumed
- * to be performed in the appropriate zone. Allocations and reference count updates must be done
- * from the thread of their physical zone. Requests to commit slab journal tail blocks from the
- * recovery journal must be done on the journal zone thread. Save operations are required to be
- * launched from the same thread as the original load operation.
+ * Each physical zone has a single dedicated queue and thread for performing all updates to the
+ * slabs assigned to that zone. The concurrency guarantees of this single-threaded model allow the
+ * code to omit more fine-grained locking for the various slab structures. Each physical zone
+ * maintains a separate copy of the slab summary to remove the need for explicit locking on that
+ * structure as well.
+ *
+ * Load operations must be performed on the admin thread. Normal operations, such as allocations
+ * and reference count updates, must be performed on the appropriate physical zone thread. Requests
+ * from the recovery journal to commit slab journal tail blocks must be scheduled from the recovery
+ * journal thread to run on the appropriate physical zone thread. Save operations must be launched
+ * from the same admin thread as the original load operation.
  */
 enum {
......
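The slab depot comment above describes requests being scheduled from the recovery journal thread to run on the appropriate physical zone thread. The sketch below illustrates that hand-off pattern in miniature: work is queued for a zone, and only the zone's own thread runs it, so zone-owned state needs no lock. Every name here (`zone_queue`, `enqueue_on_zone`, `run_zone_requests`, `count_commit`) is hypothetical, no real threads are created, and a production queue would also need to make the enqueue step itself safe across threads.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PENDING 16

typedef void (*zone_action)(void *context);

struct zone_request {
	zone_action action;
	void *context;
};

/* Hypothetical per-zone queue; a real one must make enqueueing thread-safe. */
struct zone_queue {
	struct zone_request pending[MAX_PENDING];
	size_t count;
};

/* Hand work to a zone from another thread. Returns 0 on success, -1 if full. */
static inline int enqueue_on_zone(struct zone_queue *queue, zone_action action,
				  void *context)
{
	assert(queue != NULL && action != NULL);
	if (queue->count == MAX_PENDING)
		return -1;
	queue->pending[queue->count].action = action;
	queue->pending[queue->count].context = context;
	queue->count++;
	return 0;
}

/* Run by the zone's own thread only; returns how many requests were run. */
static inline size_t run_zone_requests(struct zone_queue *queue)
{
	size_t i, completed;

	assert(queue != NULL);
	for (i = 0; i < queue->count; i++)
		queue->pending[i].action(queue->pending[i].context);
	completed = queue->count;
	queue->count = 0;
	return completed;
}

/* Example action: stands in for a tail block commit by bumping a zone-owned counter. */
static inline void count_commit(void *context)
{
	unsigned int *commits = context;

	(*commits)++;
}
```

In this sketch the counter is touched only inside `run_zone_requests`, which mirrors the claim that state owned by a zone is modified solely on that zone's thread.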