Commit fff5c910 authored by unknown's avatar unknown

two-line fix for BUG#6732 "FLUSH TABLES WITH READ LOCK + COMMIT makes next...

two-line fix for BUG#6732 "FLUSH TABLES WITH READ LOCK + COMMIT makes next FLUSH...LOCK hang forever"
(originally reported as "second run of innobackup hangs forever and can even hang server").
Plus testcase for the bugfix and comments about global read locks.


mysql-test/r/flush_block_commit.result:
  result update
mysql-test/t/flush_block_commit.test:
  testing bugfix (originally: second run of innobackup hangs)
sql/lock.cc:
  When we are in start_waiting_global_read_lock(), if we ourselves have
  the global read lock, there is nothing to start. This makes a pair with how 
  wait_if_global_read_lock() behaves when we ourselves have the global read lock.
  Previously, start_waiting_global_read_lock() decremented protect... whereas wait_if_global_read_lock()
  hadn't incremented it => very wrong => hangs.
  Descriptive comments on how global read lock works.
parent 5889a531
......@@ -20,4 +20,12 @@ commit;
a
1
unlock tables;
commit;
begin;
insert into t1 values(10);
flush tables with read lock;
commit;
unlock tables;
flush tables with read lock;
unlock tables;
drop table t1;
......@@ -45,5 +45,19 @@ reap;
connection con3;
reap;
unlock tables;
# BUG#6732 FLUSH TABLES WITH READ LOCK + COMMIT hangs later FLUSH TABLES
# WITH READ LOCK
connection con2;
commit; # unlock InnoDB row locks to allow insertions
connection con1;
begin;
insert into t1 values(10);
flush tables with read lock;
commit;
unlock tables;
connection con2;
flush tables with read lock; # bug caused hang here
unlock tables;
drop table t1;
......@@ -665,15 +665,70 @@ static void print_lock_error(int error)
/****************************************************************************
Handling of global read locks
Taking the global read lock is TWO steps (2nd step is optional; without
it, COMMIT of existing transactions will be allowed):
lock_global_read_lock() THEN make_global_read_lock_block_commit().
The global locks are handled through the global variables:
global_read_lock
count of threads which have the global read lock (i.e. have completed at
least the first step above)
global_read_lock_blocks_commit
waiting_for_read_lock
count of threads which have the global read lock and block
commits (i.e. have completed the second step above)
waiting_for_read_lock
count of threads which want to take a global read lock but cannot
protect_against_global_read_lock
count of threads which have set protection against global read lock.
How blocking of threads by global read lock is achieved: that's
advisory. Any piece of code which should be blocked by global read lock must
be designed like this:
- call to wait_if_global_read_lock(). When this returns 0, no global read
lock is owned; if argument abort_on_refresh was 0, none can be obtained.
- job
- if abort_on_refresh was 0, call to start_waiting_global_read_lock() to
allow other threads to get the global read lock. I.e. removal of the
protection.
(Note: it's a bit like an implementation of rwlock).
[ I am sorry to mention some SQL syntaxes below I know I shouldn't but found
no better descriptive way ]
Why does FLUSH TABLES WITH READ LOCK need to block COMMIT: because it's used
to read a non-moving SHOW MASTER STATUS, and a COMMIT writes to the binary
log.
Why getting the global read lock is two steps and not one. Because FLUSH
TABLES WITH READ LOCK needs to insert one other step between the two:
flushing tables. So the order is
1) lock_global_read_lock() (prevents any new table write locks, i.e. stalls
all new updates)
2) close_cached_tables() (the FLUSH TABLES), which will wait for tables
currently opened and being updated to close (so it's possible that there is
a moment where all new updates of server are stalled *and* FLUSH TABLES WITH
READ LOCK is, too).
3) make_global_read_lock_block_commit().
If we have merged 1) and 3) into 1), we would have had this deadlock:
imagine thread 1 and 2, in non-autocommit mode, thread 3, and an InnoDB
table t.
thd1: SELECT * FROM t FOR UPDATE;
thd2: UPDATE t SET a=1; # blocked by row-level locks of thd1
thd3: FLUSH TABLES WITH READ LOCK; # blocked in close_cached_tables() by the
table instance of thd2
thd1: COMMIT; # blocked by thd3.
thd1 blocks thd2 which blocks thd3 which blocks thd1: deadlock.
Note that we need to support that one thread does
FLUSH TABLES WITH READ LOCK; and then COMMIT;
(that's what innobackup does, for some good reason).
So in this exceptional case the COMMIT should not be blocked by the FLUSH
TABLES WITH READ LOCK.
TODO in MySQL 5.x: make_global_read_lock_block_commit() should be
killable. Normally CPU does not spend a long time in this function (COMMITs
are quite fast), but it would still be nice.
Taking the global read lock is TWO steps (2nd step is optional; without
it, COMMIT of existing transactions will be allowed):
lock_global_read_lock() THEN make_global_read_lock_block_commit().
****************************************************************************/
volatile uint global_read_lock=0;
......@@ -783,6 +838,8 @@ void start_waiting_global_read_lock(THD *thd)
{
bool tmp;
DBUG_ENTER("start_waiting_global_read_lock");
if (unlikely(thd->global_read_lock))
DBUG_VOID_RETURN;
(void) pthread_mutex_lock(&LOCK_open);
tmp= (!--protect_against_global_read_lock && waiting_for_read_lock);
(void) pthread_mutex_unlock(&LOCK_open);
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment