Commit fa1f9939 authored by unknown

comments


storage/maria/tablockman.c:
  comments. bugfix - a special case in release_locks
storage/maria/unittest/lockman1-t.c:
  updated
storage/maria/unittest/lockman2-t.c:
  new tests
parent c11eccf9
// TODO - allocate everything from dynarrays !!! (benchmark)
// TODO instant duration locks
// automatically place S instead of LS if possible
/* Copyright (C) 2006 MySQL AB
......@@ -23,6 +22,93 @@
#include <lf.h>
#include "tablockman.h"
/*
Lock Manager for Table Locks
The code below handles locks on resources - but it is optimized for the
case where the number of resources is not very large, and there are many
locks per resource - that is, a resource is likely to be a table or a
database, but hardly a row in a table.
Locks belong to "lock owners". A Lock Owner is uniquely identified by a
16-bit number - loid (lock owner identifier). A function loid_to_tlo must
be provided by the application that takes such a number as an argument
and returns a TABLE_LOCK_OWNER structure.
Lock levels are completely defined by three tables. Lock compatibility
matrix specifies which locks can be held at the same time on a resource.
Lock combining matrix specifies what lock level has the same behaviour as
a pair of two locks of given levels. getlock_result matrix simplifies
intention locking and lock escalation for an application, basically it
defines which locks are intention locks and which locks are "loose"
locks. It is only used to provide better diagnostics for the
application, lock manager itself does not differentiate between normal,
intention, and loose locks.
The assumptions are: few distinct resources, many locks are held at the
same time on one resource. Thus: a lock structure _per resource_ can be
rather large; a lock structure _per lock_ does not need to be very small
either; we need to optimize for _speed_. Operations we need are: place a
lock, check if a particular transaction already has a lock on this
resource, check if a conflicting lock exists, if yes - find who owns it.
Solution: every resource has a structure with
1. Hash of "active" (see below for the description of "active") granted
locks with loid as a key. Thus, checking if a given transaction has a
lock on this resource is an O(1) operation.
2. Doubly-linked lists of all granted locks - one list for every lock
type. Thus, checking if a conflicting lock exists is a check whether
an appropriate list head pointer is not null, also O(1).
3. Every lock has a loid of the owner, thus checking who owns a
conflicting lock is also O(1).
4. Deque of waiting locks. It's a deque not a fifo, because for lock
upgrades requests are added to the queue head, not tail. There's never
a need to scan the queue.
Result: adding or removing a lock is always an O(1) operation, it does not
depend on the number of locks on the resource, or number of transactions,
or number of resources. It _does_ depend on the number of different lock
levels - O(number_of_lock_levels) - but it's a constant.
Waiting: if there is a conflicting lock or if wait queue is not empty, a
requested lock cannot be granted at once. It is added to the end of the
wait queue. If there is a conflicting lock - the "blocker" transaction is
the owner of this lock. If there's no conflict but the queue was not empty,
then the "blocker" is the transaction that the owner of the lock at the
end of the queue is waiting for (in other words, our lock is added to the
end of the wait queue, and our blocker is the same as of the lock right
before us).
Lock upgrades: when a thread that has a lock on a given resource
requests a new lock on the same resource and the old lock is not enough
to satisfy the new lock requirements (which is defined by
lock_combining_matrix[old_lock][new_lock] != old_lock), a new lock
(defined by lock_combining_matrix as above) is placed. Depending on
other granted locks it is immediately active or it has to wait. Here the
lock is added to the start of the waiting queue, not to the end. The old
lock is removed from the hash, but not from the doubly-linked lists
(indeed, a transaction checks "do I have a lock on this resource?" by
looking in the hash, and it should find the latest lock, so old locks must
be removed; but a transaction checks "are there conflicting locks?" by
checking the doubly-linked lists, and it doesn't matter if it finds an old
lock - if the old lock were removed, the new lock would be a conflict too).
To better support table-row relations where one needs to lock the table
with an intention lock before locking the row, extended diagnostics are
provided. When an intention lock (presumably on a table) is granted,
lockman_getlock() returns one of GOT_THE_LOCK (no need to lock the row,
perhaps the thread already has a normal lock on this table),
GOT_THE_LOCK_NEED_TO_LOCK_A_SUBRESOURCE (need to lock the row, as usual),
GOT_THE_LOCK_NEED_TO_INSTANT_LOCK_A_SUBRESOURCE (only need to check
whether it's possible to lock the row, but no need to lock it - perhaps
the thread has a loose lock on this table). This is defined by
getlock_result[] table.
Instant duration locks are not supported. Though they're trivial to add,
they are normally only used on rows, not on tables. So, presumably,
they are not needed here.
*/
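The conflict check described in the comment above (one list of granted locks per lock level, so a conflict is found by looking at list heads) can be sketched as follows. This is a minimal illustration, not the code from tablockman.c: the `level` enum, the `compat` table (a textbook S/X/IS/IX matrix, not the file's own lock_compatibility_matrix), and the `lock`/`resource` structs are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical lock levels; the real enum lock_type has more entries. */
enum level { LV_S, LV_X, LV_IS, LV_IX, LV_COUNT };

/* Textbook multi-granularity compatibility: compat[held][requested]. */
static const int compat[LV_COUNT][LV_COUNT] = {
  /*          S  X  IS IX */
  /* S  */ {  1, 0, 1, 0 },
  /* X  */ {  0, 0, 0, 0 },
  /* IS */ {  1, 0, 1, 1 },
  /* IX */ {  0, 0, 1, 1 },
};

struct lock {
  unsigned short loid;   /* id of the lock owner */
  struct lock *next;     /* next granted lock of the same level */
};

struct resource {
  struct lock *granted[LV_COUNT];  /* one list head per lock level */
};

/*
  Return a granted lock that conflicts with the request and belongs to
  somebody else, or NULL if the request can be granted at once.
  The outer loop runs once per lock level, independent of how many
  locks are currently held on the resource.
*/
static struct lock *find_conflict(const struct resource *r,
                                  unsigned short loid, enum level wanted)
{
  for (int lv = 0; lv < LV_COUNT; lv++) {
    struct lock *l = r->granted[lv];
    if (!l || compat[lv][wanted])
      continue;                      /* nothing held, or compatible */
    while (l && l->loid == loid)     /* locks of the requester do not block */
      l = l->next;
    if (l)
      return l;
  }
  return NULL;
}
```

With a single S lock held by owner 1, `find_conflict` returns NULL for another owner asking for IS and returns the S lock for another owner asking for X.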
/*
Lock compatibility matrix.
......@@ -121,28 +207,63 @@ struct st_table_lock {
};
#define hash_insert my_hash_insert /* for consistency :) */
#define remove_from_wait_queue(LOCK, TABLE) \
do \
{ \
if ((LOCK)->prev) \
{ \
DBUG_ASSERT((TABLE)->wait_queue_out != (LOCK)); \
(LOCK)->prev->next= (LOCK)->next; \
} \
else \
{ \
DBUG_ASSERT((TABLE)->wait_queue_out == (LOCK)); \
(TABLE)->wait_queue_out= (LOCK)->next; \
} \
if ((LOCK)->next) \
{ \
DBUG_ASSERT((TABLE)->wait_queue_in != (LOCK)); \
(LOCK)->next->prev= (LOCK)->prev; \
} \
else \
{ \
DBUG_ASSERT((TABLE)->wait_queue_in == (LOCK)); \
(TABLE)->wait_queue_in= (LOCK)->prev; \
} \
} while (0)
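The wait queue handling around this macro (new requests join at the `in` end, upgrades jump in at the `out` end, and a lock can be unlinked from anywhere) can be tried in isolation with the sketch below. The `waiter` and `wait_deque` types and the function names are hypothetical; only the pointer discipline is meant to mirror the macro and the queue code in tablockman_getlock.

```c
#include <assert.h>
#include <stddef.h>

struct waiter {
  int id;
  struct waiter *prev, *next;  /* prev points towards the out end */
};

struct wait_deque {
  struct waiter *in;           /* end where new requests are appended */
  struct waiter *out;          /* end where upgrade requests are pushed */
};

/* New lock request: append at the in end. */
static void push_tail(struct wait_deque *q, struct waiter *w)
{
  w->next = NULL;
  if ((w->prev = q->in))
    w->prev->next = w;
  q->in = w;
  if (!q->out)
    q->out = w;
}

/* Lock upgrade: push at the out end, ahead of everybody else. */
static void push_head(struct wait_deque *q, struct waiter *w)
{
  w->prev = NULL;
  if ((w->next = q->out))
    w->next->prev = w;
  q->out = w;
  if (!q->in)
    q->in = w;
}

/* Unlink a waiter from any position, fixing both end pointers. */
static void unlink_waiter(struct wait_deque *q, struct waiter *w)
{
  if (w->prev)
    w->prev->next = w->next;
  else
    q->out = w->next;          /* w was at the out end */
  if (w->next)
    w->next->prev = w->prev;
  else
    q->in = w->prev;           /* w was at the in end */
  w->prev = w->next = NULL;
}
```

Handling the "no prev" and "no next" cases explicitly is what lets a single routine serve both the grant path and the release path without scanning the queue.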
/*
DESCRIPTION
tries to lock a resource 'table' with a lock level 'lock'.
RETURN
see enum lockman_getlock_result
*/
enum lockman_getlock_result
tablockman_getlock(TABLOCKMAN *lm, TABLE_LOCK_OWNER *lo, LOCKED_TABLE *table, enum lock_type lock)
tablockman_getlock(TABLOCKMAN *lm, TABLE_LOCK_OWNER *lo,
LOCKED_TABLE *table, enum lock_type lock)
{
TABLE_LOCK *old, *new, *blocker;
TABLE_LOCK_OWNER *wait_for;
int i;
ulonglong deadline;
struct timespec timeout;
enum lock_type new_lock;
int i;
pthread_mutex_lock(& table->mutex);
/* do we already have a lock on this resource ? */
old= (TABLE_LOCK *)hash_search(& table->active, (byte *)&lo->loid,
sizeof(lo->loid));
/* perhaps we have the lock already ? */
/* and if yes, is it enough to satisfy the new request */
if (old && lock_combining_matrix[old->lock_type][lock] == old->lock_type)
{
/* yes */
pthread_mutex_unlock(& table->mutex);
return getlock_result[old->lock_type][lock];
}
/* no, placing a new lock. first - take a free lock structure from the pool */
pthread_mutex_lock(& lm->pool_mutex);
new= lm->pool;
if (new)
......@@ -161,25 +282,28 @@ tablockman_getlock(TABLOCKMAN *lm, TABLE_LOCK_OWNER *lo, LOCKED_TABLE *table, en
}
}
/* calculate required upgraded lock type */
/* calculate the level of the upgraded lock */
new_lock= old ? lock_combining_matrix[old->lock_type][lock] : lock;
new->loid= lo->loid;
new->lock_type= new_lock;
new->table= table;
for (new->next= table->wait_queue_in ; ; )
/* and try to place it */
for (new->prev= table->wait_queue_in ; ; )
{
if (!old && new->next)
/* waiting queue is not empty and we're not upgrading */
if (!old && new->prev)
{
/* need to wait */
DBUG_ASSERT(table->wait_queue_out);
DBUG_ASSERT(table->wait_queue_in);
blocker= new->next;
blocker= new->prev;
/* wait for a previous lock in the queue or for a lock it's waiting for */
if (lock_compatibility_matrix[blocker->lock_type][lock])
wait_for= lm->loid_to_lo(blocker->loid)->waiting_for;
wait_for= lm->loid_to_tlo(blocker->loid)->waiting_for;
else
wait_for= lm->loid_to_lo(blocker->loid);
wait_for= lm->loid_to_tlo(blocker->loid);
}
else
{
......@@ -188,24 +312,27 @@ tablockman_getlock(TABLOCKMAN *lm, TABLE_LOCK_OWNER *lo, LOCKED_TABLE *table, en
{
if (table->active_locks[i] && !lock_compatibility_matrix[i+1][lock])
{
/* the first lock in the list may be our own - skip it */
for (blocker= table->active_locks[i];
blocker && blocker->loid == lo->loid;
blocker= blocker->next);
blocker= blocker->next) /* no-op */;
if (blocker)
break;
}
}
if (!blocker)
if (!blocker) /* free to go */
break;
wait_for= lm->loid_to_lo(blocker->loid);
wait_for= lm->loid_to_tlo(blocker->loid);
}
/* ok, we're here - the wait is inevitable */
lo->waiting_for= wait_for;
if (!lo->waiting_lock) /* first iteration */
if (!lo->waiting_lock) /* first iteration of the for() loop */
{
/* lock upgrade or new lock request ? */
if (old)
{
/* upgrade - add the lock to the _start_ of the wait queue */
new->prev= 0;
if ((new->next= table->wait_queue_out))
new->next->prev= new;
......@@ -215,6 +342,7 @@ tablockman_getlock(TABLOCKMAN *lm, TABLE_LOCK_OWNER *lo, LOCKED_TABLE *table, en
}
else
{
/* new lock - add the lock to the _end_ of the wait queue */
new->next= 0;
if ((new->prev= table->wait_queue_in))
new->prev->next= new;
......@@ -222,7 +350,6 @@ tablockman_getlock(TABLOCKMAN *lm, TABLE_LOCK_OWNER *lo, LOCKED_TABLE *table, en
if (!table->wait_queue_out)
table->wait_queue_out=table->wait_queue_in;
}
lo->waiting_lock= new;
deadline= my_getsystime() + lm->lock_timeout * 10000;
......@@ -238,6 +365,7 @@ tablockman_getlock(TABLOCKMAN *lm, TABLE_LOCK_OWNER *lo, LOCKED_TABLE *table, en
}
}
/* now really wait */
pthread_mutex_lock(wait_for->mutex);
pthread_mutex_unlock(& table->mutex);
......@@ -245,32 +373,30 @@ tablockman_getlock(TABLOCKMAN *lm, TABLE_LOCK_OWNER *lo, LOCKED_TABLE *table, en
pthread_mutex_unlock(wait_for->mutex);
pthread_mutex_lock(& table->mutex);
/* ... and repeat from the beginning */
}
/* yeah! we can place the lock now */
/* remove the lock from the wait queue, if it was there */
if (lo->waiting_lock)
{
if (new->prev)
new->prev->next= new->next;
if (new->next)
new->next->prev= new->prev;
if (table->wait_queue_in == new)
table->wait_queue_in= new->prev;
if (table->wait_queue_out == new)
table->wait_queue_out= new->next;
remove_from_wait_queue(new, table);
lo->waiting_lock= 0;
lo->waiting_for= 0;
}
/* add it to the list of all locks of this lock owner */
new->next_in_lo= lo->active_locks;
lo->active_locks= new;
/* and to the list of active locks of this lock type */
new->prev= 0;
if ((new->next= table->active_locks[new_lock-1]))
new->next->prev= new;
table->active_locks[new_lock-1]= new;
/* placing the lock */
hash_insert(& table->active, (byte *)new);
/* remove the old lock from the hash, if upgrading */
if (old)
{
new->upgraded_from= old;
......@@ -279,45 +405,69 @@ tablockman_getlock(TABLOCKMAN *lm, TABLE_LOCK_OWNER *lo, LOCKED_TABLE *table, en
else
new->upgraded_from= 0;
/* and add a new lock to the hash, voila */
hash_insert(& table->active, (byte *)new);
pthread_mutex_unlock(& table->mutex);
return getlock_result[lock][lock];
}
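The two uses of the combining matrix in tablockman_getlock (deciding whether the held lock already covers the request, and computing the level of the upgraded lock) can be illustrated with a small stand-alone table. The matrix below is an assumption: it is the usual least-upper-bound table for N, S, X, IS, IX and SIX, written for this example, and is not copied from lock_combining_matrix in the source. The helper names are hypothetical as well.

```c
#include <assert.h>

/* Hypothetical subset of lock levels, in the order of the first six
   entries of lock2str[]. */
enum lk { LK_N, LK_S, LK_X, LK_IS, LK_IX, LK_SIX, LK_COUNT };

/* combine[held][wanted]: the single level that behaves like both. */
static const enum lk combine[LK_COUNT][LK_COUNT] = {
  /*            N       S       X     IS      IX      SIX   */
  /* N   */ { LK_N,   LK_S,   LK_X, LK_IS,  LK_IX,  LK_SIX },
  /* S   */ { LK_S,   LK_S,   LK_X, LK_S,   LK_SIX, LK_SIX },
  /* X   */ { LK_X,   LK_X,   LK_X, LK_X,   LK_X,   LK_X   },
  /* IS  */ { LK_IS,  LK_S,   LK_X, LK_IS,  LK_IX,  LK_SIX },
  /* IX  */ { LK_IX,  LK_SIX, LK_X, LK_IX,  LK_IX,  LK_SIX },
  /* SIX */ { LK_SIX, LK_SIX, LK_X, LK_SIX, LK_SIX, LK_SIX },
};

/* The held lock is enough if combining changes nothing. */
static int already_sufficient(enum lk held, enum lk wanted)
{
  return combine[held][wanted] == held;
}

/* Level of the lock to place: upgraded if there is an old lock,
   otherwise exactly what was asked for. */
static enum lk level_to_place(int has_old, enum lk held, enum lk wanted)
{
  return has_old ? combine[held][wanted] : wanted;
}
```

Under this table an owner holding S that asks for IS needs no new lock, while the same owner asking for IX would have an SIX lock placed as an upgrade.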
/*
DESCRIPTION
release all locks belonging to a transaction.
signal waiters to continue
*/
void tablockman_release_locks(TABLOCKMAN *lm, TABLE_LOCK_OWNER *lo)
{
TABLE_LOCK *lock, *tmp, *local_pool= 0, *local_pool_end;
TABLE_LOCK *lock, *local_pool= 0, *local_pool_end;
/*
instead of adding released locks to a pool one by one, we'll link
them in a list and add to a pool in one short action (under a mutex)
*/
local_pool_end= lo->waiting_lock ? lo->waiting_lock : lo->active_locks;
if (!local_pool_end)
return;
/* release a waiting lock, if any */
if ((lock= lo->waiting_lock))
{
DBUG_ASSERT(lock->loid == lo->loid);
pthread_mutex_lock(& lock->table->mutex);
if (lock->prev)
lock->prev->next= lock->next;
if (lock->next)
lock->next->prev= lock->prev;
if (lock->table->wait_queue_in == lock)
lock->table->wait_queue_in= lock->prev;
if (lock->table->wait_queue_out == lock)
lock->table->wait_queue_out= lock->next;
remove_from_wait_queue(lock, lock->table);
/*
a special case: if this lock was not the last in the wait queue
and it's compatible with the next lock, then the next lock
is waiting for our blocker though really it waits for us, indirectly.
Signal our blocker to release this next lock (after we removed our
lock from the wait queue, of course).
*/
if (lock->prev &&
lock_compatibility_matrix[lock->prev->lock_type][lock->lock_type])
{
pthread_mutex_lock(lo->waiting_for->mutex);
pthread_cond_broadcast(lo->waiting_for->cond);
pthread_mutex_unlock(lo->waiting_for->mutex);
}
lo->waiting_for= 0;
pthread_mutex_unlock(& lock->table->mutex);
lock->next= local_pool;
local_pool= lock;
DBUG_ASSERT(lock->loid == lo->loid);
}
/* now release granted locks */
lock= lo->active_locks;
while (lock)
{
TABLE_LOCK *cur= lock;
pthread_mutex_t *mutex= & lock->table->mutex;
DBUG_ASSERT(cur->loid == lo->loid);
lock= lock->next_in_lo;
/* TODO ? group locks by table to reduce the number of mutex locks */
pthread_mutex_lock(mutex);
hash_delete(& cur->table->active, (byte *)cur);
......@@ -330,17 +480,21 @@ void tablockman_release_locks(TABLOCKMAN *lm, TABLE_LOCK_OWNER *lo)
cur->next= local_pool;
local_pool= cur;
DBUG_ASSERT(cur->loid == lo->loid);
pthread_mutex_unlock(mutex);
}
lo->waiting_lock= lo->active_locks= 0;
/*
okay, all locks released. now signal that we're leaving,
in case somebody's waiting for it
*/
pthread_mutex_lock(lo->mutex);
pthread_cond_broadcast(lo->cond);
pthread_mutex_unlock(lo->mutex);
/* and push all freed locks to the lockman's pool */
pthread_mutex_lock(& lm->pool_mutex);
local_pool_end->next= lm->pool;
lm->pool= local_pool;
......@@ -350,7 +504,7 @@ void tablockman_release_locks(TABLOCKMAN *lm, TABLE_LOCK_OWNER *lo)
void tablockman_init(TABLOCKMAN *lm, loid_to_tlo_func *func, uint timeout)
{
lm->pool= 0;
lm->loid_to_lo= func;
lm->loid_to_tlo= func;
lm->lock_timeout= timeout;
pthread_mutex_init(&lm->pool_mutex, MY_MUTEX_INIT_FAST);
}
......@@ -366,12 +520,12 @@ void tablockman_destroy(TABLOCKMAN *lm)
pthread_mutex_destroy(&lm->pool_mutex);
}
void tablockman_init_locked_table(LOCKED_TABLE *lt)
void tablockman_init_locked_table(LOCKED_TABLE *lt, int initial_hash_size)
{
TABLE_LOCK *unused;
bzero(lt, sizeof(*lt));
pthread_mutex_init(& lt->mutex, MY_MUTEX_INIT_FAST);
hash_init(& lt->active, &my_charset_bin, 10/*FIXME*/,
hash_init(& lt->active, &my_charset_bin, initial_hash_size,
offsetof(TABLE_LOCK, loid), sizeof(unused->loid), 0, 0, 0);
}
......@@ -381,6 +535,7 @@ void tablockman_destroy_locked_table(LOCKED_TABLE *lt)
pthread_mutex_destroy(& lt->mutex);
}
#ifdef EXTRA_DEBUG
static char *lock2str[LOCK_TYPES+1]= {"N", "S", "X", "IS", "IX", "SIX",
"LS", "LX", "SLX", "LSIX"};
......@@ -396,4 +551,5 @@ void print_tlo(TABLE_LOCK_OWNER *lo)
printf("!");
printf("\n");
}
#endif
......@@ -51,34 +51,38 @@ typedef TABLE_LOCK_OWNER *loid_to_tlo_func(uint16);
typedef struct {
pthread_mutex_t pool_mutex;
TABLE_LOCK *pool;
TABLE_LOCK *pool; /* lifo pool of free locks */
uint lock_timeout;
loid_to_tlo_func *loid_to_lo;
loid_to_tlo_func *loid_to_tlo; /* for mapping loid to TABLE_LOCK_OWNER */
} TABLOCKMAN;
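The `pool` member is a LIFO of free lock structures, and tablockman_release_locks returns a whole chain of freed locks to it in one step instead of one lock at a time. A minimal single-threaded sketch of that splice follows. The `node`/`pool` types and function names are hypothetical, and the mutex that protects the real pool is only marked by comments so the example needs no threading library.

```c
#include <assert.h>
#include <stddef.h>

struct node { struct node *next; };

struct pool {
  struct node *head;   /* a real pool would also carry a mutex */
};

/*
  Push an already linked chain first -> ... -> last onto the pool.
  The work done "under the mutex" is two pointer writes, no matter
  how long the chain is.
*/
static void pool_push_chain(struct pool *p, struct node *first,
                            struct node *last)
{
  /* lock the pool mutex here */
  last->next = p->head;
  p->head = first;
  /* unlock the pool mutex here */
}

/* Take one free node from the pool, or NULL if it is empty. */
static struct node *pool_pop(struct pool *p)
{
  struct node *n = p->head;
  if (n)
    p->head = n->next;
  return n;
}
```

The caller has to remember the last node of its local chain while building it, which is the role `local_pool_end` appears to play in tablockman_release_locks.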
struct st_table_lock_owner {
TABLE_LOCK *active_locks, *waiting_lock;
TABLE_LOCK_OWNER *waiting_for;
pthread_cond_t *cond; /* transactions waiting for this, wait on 'cond' */
TABLE_LOCK *active_locks; /* list of active locks */
TABLE_LOCK *waiting_lock; /* waiting lock (one lock only) */
TABLE_LOCK_OWNER *waiting_for; /* transaction we're waiting for */
pthread_cond_t *cond; /* transactions waiting for us, wait on 'cond' */
pthread_mutex_t *mutex; /* mutex is required to use 'cond' */
uint16 loid;
uint16 loid; /* Lock Owner IDentifier */
};
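The `waiting_for` member is what lets a request that joins a non-empty wait queue pick its blocker without scanning anything: it looks only at the last queued lock. The sketch below restates that rule from tablockman_getlock with hypothetical types and a made-up two-level compatibility table (S and X only), so it is an illustration of the rule rather than the actual code.

```c
#include <assert.h>
#include <stddef.h>

enum { Q_S, Q_X, Q_LEVELS };   /* hypothetical two-level scheme */

/* q_compat[queued][requested] */
static const int q_compat[Q_LEVELS][Q_LEVELS] = { {1, 0}, {0, 0} };

struct q_owner { struct q_owner *waiting_for; };
struct q_lock  { struct q_owner *owner; int level; };

/*
  Choose whom to wait for when the queue is not empty:
  - if we are compatible with the last queued lock, we are blocked by
    whatever blocks its owner;
  - otherwise we are blocked by its owner directly.
*/
static struct q_owner *pick_wait_for(const struct q_lock *last, int wanted)
{
  if (q_compat[last->level][wanted])
    return last->owner->waiting_for;
  return last->owner;
}
```

For example, if owner B is queued with an S request behind owner A, a further S request waits for A, while an X request waits for B.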
struct st_locked_table {
pthread_mutex_t mutex;
HASH active; // fast to remove
TABLE_LOCK *active_locks[LOCK_TYPES]; // fast to see a conflict
TABLE_LOCK *wait_queue_in, *wait_queue_out;
pthread_mutex_t mutex; /* mutex for everything below */
HASH active; /* active locks in a hash */
TABLE_LOCK *active_locks[LOCK_TYPES]; /* dl-list of locks per type */
TABLE_LOCK *wait_queue_in, *wait_queue_out; /* wait deque */
};
void tablockman_init(TABLOCKMAN *, loid_to_tlo_func *, uint);
void tablockman_destroy(TABLOCKMAN *);
enum lockman_getlock_result tablockman_getlock(TABLOCKMAN *, TABLE_LOCK_OWNER *,
LOCKED_TABLE *,
enum lock_type lock);
LOCKED_TABLE *, enum lock_type);
void tablockman_release_locks(TABLOCKMAN *, TABLE_LOCK_OWNER *);
void tablockman_init_locked_table(LOCKED_TABLE *);
void tablockman_init_locked_table(LOCKED_TABLE *, int);
void tablockman_destroy_locked_table(LOCKED_TABLE *);
#ifdef EXTRA_DEBUG
void print_tlo(TABLE_LOCK_OWNER *);
#endif
#endif
......@@ -288,10 +288,10 @@ int main()
for (i= 0; i < Ntbls; i++)
{
tablockman_init_locked_table(ltarray+i);
tablockman_init_locked_table(ltarray+i, Nlos);
}
//test_tablockman_simple();
test_tablockman_simple();
#define CYCLES 10000
#define THREADS Nlos /* don't change this line */
......
......@@ -115,6 +115,20 @@ void test_tablockman_simple()
lock_conflict(2, 1, X);
unlock_all(1);
unlock_all(2);
lock_ok_i(1, 1, IS);
lock_conflict(2, 1, X);
lock_conflict(3, 1, IS);
unlock_all(1);
unlock_all(2);
unlock_all(3);
lock_ok_a(1, 1, S);
lock_conflict(2, 1, IX);
lock_conflict(3, 1, IS);
unlock_all(1);
unlock_all(2);
unlock_all(3);
}
int rt_num_threads;
......@@ -273,7 +287,7 @@ int main()
for (i= 0; i < Ntbls; i++)
{
tablockman_init_locked_table(ltarray+i);
tablockman_init_locked_table(ltarray+i, Nlos);
}
test_tablockman_simple();
......@@ -285,7 +299,7 @@ int main()
Nrows= 100;
Ntables= 10;
table_lock_ratio= 10;
run_test("\"random lock\" stress test", test_lockman, THREADS, CYCLES);
//run_test("\"random lock\" stress test", test_lockman, THREADS, CYCLES);
#if 0
/* "real-life" simulation - many rows, no table locks */
Nrows= 1000000;
......