Commit 8afcdfe9 authored by Thayumanavar

BUG#14458232 - CRASH IN THD_IS_TRANSACTION_ACTIVE DURING THREAD POOLING STRESS TEST
PROBLEM:
Connection stress tests which consist of concurrent
connection kills interleaved with mysql ping queries
cause a mysqld server which uses the thread pool scheduler
to crash.
FIX:
Killing a connection involves shutting down and closing the client
socket. This can cause EPOLLHUP (or EPOLLERR) events to be queued
and handled after the connection object (THD) has been disarmed
and its cleanup is under way. We disarm the connection by setting
the epoll mask to zero, which is intended to ensure that no events
arrive, release the ownership held by the waiting thread that
collects events, and then clean up the THD object. As per the
Linux kernel epoll source code
(http://lxr.linux.no/linux+*/fs/eventpoll.c#L1771), EPOLLHUP
(or EPOLLERR) can't be masked even if we set the epoll mask
to zero. So we disarm the connection, and thus prevent
execution of any query processing handler/queueing to the
client ctx. queue, by removing the client fd from the epoll
set via EPOLL_CTL_DEL.
Also, there is a race condition which involves the following
threads:
1) Thread X is executing KILL CONNECTION Y, is in THD::awake,
and is using mysys_var (holding LOCK_thd_data).
2) Thread Y is executing in tp_process_event and is being killed.
3) Thread Z receives the KILL flag internally and possibly calls
the tp_thd_cleanup function, which sets the thread session
variable and changes mysys_var.
The fix for the above race is to set the thread session variable
under LOCK_thd_data.
We also do not call THD::awake if the thread that is to be killed
is found in the thread list but its KILL_CONNECTION flag is
already set, thus avoiding any possible concurrent cleanup. This
patch was approved by Mikael Ronstrom via email review.
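
The epoll behaviour that the fix relies on can be reproduced outside the server. The following is a minimal, Linux-only sketch and not code from this patch: `events_after_disarm` and `wait_one` are hypothetical helper names, and a UNIX socketpair stands in for the client connection. It registers one end of the pair, disarms it either by zeroing the mask (EPOLL_CTL_MOD) or by removing it (EPOLL_CTL_DEL), closes the peer, and reports what epoll still delivers.

```cpp
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cstdint>

// Wait up to timeout_ms for one event; return its event bits, or 0 if none.
static uint32_t wait_one(int epfd, int timeout_ms) {
  epoll_event ev{};
  int n = epoll_wait(epfd, &ev, 1, timeout_ms);
  return n == 1 ? ev.events : 0;
}

// Hypothetical helper (not part of the patch). Returns the event bits that
// epoll reports after the peer closes, or UINT32_MAX if a setup call fails.
uint32_t events_after_disarm(bool use_ctl_del) {
  int sv[2];
  if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) return UINT32_MAX;
  int epfd = epoll_create1(0);
  if (epfd < 0) {
    close(sv[0]);
    close(sv[1]);
    return UINT32_MAX;
  }

  // Arm the descriptor the usual way first.
  epoll_event ev{};
  ev.events = EPOLLIN;
  ev.data.fd = sv[0];
  bool ok = epoll_ctl(epfd, EPOLL_CTL_ADD, sv[0], &ev) == 0;

  if (ok) {
    if (use_ctl_del) {
      // Disarm by removing the fd from the epoll set.
      ok = epoll_ctl(epfd, EPOLL_CTL_DEL, sv[0], &ev) == 0;
    } else {
      // Disarm by zeroing the mask; hangup/error events are still reported.
      ev.events = 0;
      ok = epoll_ctl(epfd, EPOLL_CTL_MOD, sv[0], &ev) == 0;
    }
  }

  close(sv[1]);  // peer goes away, which raises a hangup on sv[0]
  uint32_t got = ok ? wait_one(epfd, 100) : UINT32_MAX;

  close(sv[0]);
  close(epfd);
  return got;
}
```

With the zero mask, the returned bits are expected to include EPOLLHUP; after EPOLL_CTL_DEL, the wait is expected to time out with no event.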
parent 2512ee8b
@@ -61,6 +61,7 @@ uint thd_get_net_read_write(THD *thd);
 void thd_set_mysys_var(THD *thd, st_my_thread_var *mysys_var);
 ulong thd_get_net_wait_timeout(THD *thd);
 my_socket thd_get_fd(THD *thd);
+int thd_store_globals(THD* thd);
 THD *first_global_thread();
 THD *next_global_thread(THD *thd);
@@ -468,6 +468,18 @@ my_socket thd_get_fd(THD *thd)
   return thd->net.vio->sd;
 }
+/**
+  Set thread specific environment required for thd cleanup in thread pool.
+  @param thd THD object
+  @retval 1 if thread-specific enviroment could be set else 0
+*/
+int thd_store_globals(THD* thd)
+{
+  return thd->store_globals();
+}
 /**
   Get thread attributes for connection threads
@@ -6471,8 +6471,14 @@ uint kill_one_thread(THD *thd, ulong id, bool only_kill_query)
     if ((thd->security_ctx->master_access & SUPER_ACL) ||
         thd->security_ctx->user_matches(tmp->security_ctx))
     {
-      tmp->awake(only_kill_query ? THD::KILL_QUERY : THD::KILL_CONNECTION);
-      error=0;
+      /* process the kill only if thread is not already undergoing any kill
+         connection.
+      */
+      if (tmp->killed != THD::KILL_CONNECTION)
+      {
+        tmp->awake(only_kill_query ? THD::KILL_QUERY : THD::KILL_CONNECTION);
+      }
+      error= 0;
     }
     else
       error=ER_KILL_DENIED_ERROR;
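
The guard added to kill_one_thread, together with the rule that the session variable is only changed under LOCK_thd_data, can be illustrated in isolation. This is a simplified sketch under stated assumptions, not the server code: `Session`, `lock_data`, `awake_locked`, `set_mysys_var` and `kill_session` are hypothetical stand-ins for THD, LOCK_thd_data, THD::awake, thd_set_mysys_var and kill_one_thread, and `awake_calls` exists only so the effect can be observed.

```cpp
#include <mutex>

// Hypothetical stand-in for THD; not the server's real class.
struct Session {
  enum Killed { NOT_KILLED, KILL_QUERY, KILL_CONNECTION };

  std::mutex lock_data;            // plays the role of LOCK_thd_data
  Killed killed = NOT_KILLED;
  const void *mysys_var = nullptr;
  int awake_calls = 0;             // instrumentation for the sketch only

  // Caller must hold lock_data, as the killer does in the commit message.
  void awake_locked(Killed state) {
    killed = state;
    ++awake_calls;
  }

  // Cleanup path: change the session variable only under the same lock,
  // so it cannot change while a killer is inside awake_locked().
  void set_mysys_var(const void *v) {
    std::lock_guard<std::mutex> guard(lock_data);
    mysys_var = v;
  }
};

// Mirrors the guard in the diff: skip awake() when a connection kill is
// already in progress, so a second killer cannot race with the cleanup.
void kill_session(Session &s, bool only_kill_query) {
  std::lock_guard<std::mutex> guard(s.lock_data);
  if (s.killed != Session::KILL_CONNECTION) {
    s.awake_locked(only_kill_query ? Session::KILL_QUERY
                                   : Session::KILL_CONNECTION);
  }
}
```

In this sketch a repeated KILL QUERY still reaches `awake_locked`, while any kill request that arrives after KILL_CONNECTION has been set is ignored, which matches the condition in the hunk above.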