Bug#17599258:- ERROR 1160 (08S01): GOT AN ERROR WRITING COMMUNICATION PACKETS; FEDERATED TABLE

Description:- Executing FLUSH TABLES on a federated table that has been idle for longer than wait_timeout (on the remote server) + tcp_keepalive_time fails with "ERROR 1160 (08S01): Got an error writing communication packets."

Analysis:- During FLUSH TABLES execution the federated table is closed, which in turn closes the federated connection. While closing the connection, the federated server tries to communicate with the remote server. Because the connection has been idle for wait_timeout (on the remote server) + tcp_keepalive_time, the socket has already been closed. The write therefore fails with a broken pipe and the error is thrown. Federated connections are expected to reconnect silently, but here the connection cannot reconnect because the "auto_reconnect" variable is set to 0 in "mysql_close()".

Fix:- Before closing the federated connection in "ha_federated_close()", a check is added to verify whether the connection is still alive. If it is not, "mysql->net.error" is set to 2 to indicate that the connection is broken. In addition, setting "auto_reconnect" to 0 is delayed until after the "COM_QUIT" command.

NOTE:- Reproducing this issue requires setting "tcp_keepalive_time" to a smaller value. This value lives in the "/proc/sys/net/ipv4/tcp_keepalive_time" file on Linux systems, so changing it needs root permission, which is not available from an mtr test. The patch is therefore submitted without an mtr test.
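The control flow of the fix described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual patch: the types and functions (mock_net, mock_conn, mock_is_connected, mock_close, federated_close_sketch) are hypothetical stand-ins for the real client structures, and the assumption that the close routine skips COM_QUIT when the error field is 2 is made only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; NOT the real MySQL client structures. */
struct mock_net {
  bool socket_alive; /* false once the remote side has dropped the socket */
  int error;         /* 2 marks the connection as broken (as in the commit text) */
};

struct mock_conn {
  struct mock_net net;
  bool auto_reconnect;
  int quit_sent;     /* number of COM_QUIT packets written successfully */
  int write_errors;  /* number of failed writes (the ERROR 1160 case) */
};

/* Hypothetical liveness check, standing in for the check added by the fix. */
static bool mock_is_connected(const struct mock_conn *conn) {
  return conn->net.socket_alive;
}

/* Hypothetical close routine. Assumption for this sketch: when the
   connection is already marked broken (error == 2), COM_QUIT is skipped. */
static void mock_close(struct mock_conn *conn) {
  if (conn->net.error != 2) {
    if (conn->net.socket_alive)
      conn->quit_sent++;     /* COM_QUIT reaches the remote server */
    else
      conn->write_errors++;  /* write on a dead socket: broken pipe */
  }
  /* auto_reconnect is cleared only after the COM_QUIT step. */
  conn->auto_reconnect = false;
}

/* Sketch of the fixed close path: check liveness first, then close. */
static void federated_close_sketch(struct mock_conn *conn) {
  assert(conn != NULL);
  if (!mock_is_connected(conn))
    conn->net.error = 2;     /* mark as broken before closing */
  mock_close(conn);
}
```

In this sketch, calling federated_close_sketch() on a connection whose socket is no longer alive marks it as broken and records no write error, while a live connection still sends COM_QUIT as before.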