MDEV-24097: galera[_3nodes] suite tests in MTR sporadically fails
This is the first part of the fixes for MDEV-24097. This commit contains the fixes for instability when testing Galera and when restarting nodes quickly: 1) Protection against a "stuck" old SST process during the execution of the new SST (after restarting the node) is now implemented for mariabackup / xtrabackup, which should help to avoid almost all conflicts due to the use of the same ports - both during testing with mtr, so and when restarting nodes quickly in a production environment. 2) Added more protection to scripts against unexpected return of the rc != 0 (in the commands for deleting temporary files, etc). 3) Added protection against unexpected crashes during binlog transfer (in SST scripts for rsync). 4) Spaces and some special characters in binlog filenames shouldn't be a problem now (at the script level). 5) Daemon process termination tracking has been made more robust against crashes due to unexpected termination of the previous SST process while new scripts are running. 6) Reading ssl encryption parameters has been moved from specific SST scripts to a common wsrep_sst_common.sh script, which allows unified error handling, unified diagnostics and simplifies script revisions in the future. 7) Improved diagnostics of errors related to the use of openssl. 8) Corrections have been made for xtrabackup-v2 (both in tests and in the script code) that restore the work of xtrabackup with updated versions of innodb. 9) Fixed some tests for galera_3nodes, although the complete solution for the problem of starting three nodes at the same time on fast machines will be done in a separate commit. No additional tests are required as this commit fixes problems with existing tests.
Showing
Please register or sign in to comment