1. 06 Sep, 2023 2 commits
  2. 05 Sep, 2023 6 commits
    • MDEV-30836 MTR hangs after tests have completed · a49b9314
      Aleksey Midenkov authored
      The problem is in manager/worker communication when the worker sends
      WARNINGS and then TESTRESULT. If the manager has not yet read the
      WARNINGS response, both responses end up in the same buffer.
      can_read() will indicate that data is available only once, so we must
      read all the data from the socket at once. Otherwise the TESTRESULT
      response is lost and the manager waits for it forever.
      
      The fix reads the socket in a loop instead of reading a single line.
      But if there is only one response in the buffer, the second read
      blocks until new data arrives. That can be overcome by blocking(0),
      which sets the handle to non-blocking mode. If there is no data, the
      second read just returns undef.
      
      The problem is that non-blocking mode is not supported by all Perl
      flavors on Windows. Strawberry and ActiveState do not support it;
      Cygwin and MSYS2 do. There is an ioctl() hack that was known to
      "work", but it does not do what is expected (it does not return data
      when there is data). So on Windows, unless Perl is Cygwin, we disable
      the fix.
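
The drain-the-socket idea described above can be sketched in Python. This is only an illustration of the technique, not the MTR code (which is Perl): the helper name `drain_lines` and the newline-terminated wire format are assumptions, and only the message names WARNINGS and TESTRESULT come from the commit message.

```python
import select
import socket


def drain_lines(sock, buffer=b""):
    """Read everything currently available on sock without blocking.

    Returns (complete_lines, leftover_bytes). Hypothetical helper.
    """
    sock.setblocking(False)          # analogous to Perl's $sock->blocking(0)
    while True:
        try:
            chunk = sock.recv(4096)
        except BlockingIOError:      # nothing more to read right now
            break
        if not chunk:                # peer closed the connection
            break
        buffer += chunk
    *lines, leftover = buffer.split(b"\n")
    return [line.decode() for line in lines], leftover


manager, worker = socket.socketpair()
# The worker sends two responses before the manager reads either of them.
worker.sendall(b"WARNINGS\nTESTRESULT\n")

# The readiness check fires once, although two responses are buffered.
readable, _, _ = select.select([manager], [], [], 1.0)
lines, rest = drain_lines(manager)
print(lines)

manager.close()
worker.close()
```

Reading a single line after the readiness check would return only the first response here; draining until the read would block picks up both.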
    • MDEV-30836 MTR MSYS2 fix attempt · 848b3af8
      Aleksey Midenkov authored
      MSYS2 is basically Cygwin, except that it is easier to install (but
      comes with tools that are not used) and offers more control over path
      conversion via MSYS2_ARG_CONV_EXCL and MSYS2_ENV_CONV_EXCL. So it
      should be more Windows-friendly than Cygwin.
      
      Installation
      
      Similar to Cygwin, except that installing patch requires an additional
      command run from the shell:
      
          pacman -S patch
      
      MSYS2 still doesn't work, as it returns a weird "Bad address" error when
      exec-ing the forked process from create_process(). The same exec from a
      standalone perl -e runs just fine... :(
    • MDEV-30836 MTR Cygwin fix · 640cd404
      Aleksey Midenkov authored
      Cygwin is more Unix-oriented. It does not treat \n as \r\n in regexps
      (fixed by \R) and it supplies Unix-style paths (fixed by
      mixed_path()). It also cleans up paths when running an exe, so the
      path differs in the exe output (for $exe_mysqld, comparing
      basename() is enough).
      
      Cygwin installation
      
      1. Install the latest perl version (base package only) and
         patchutils from cygwin-setup;
      2. Add c:\cygwin64\bin to the system path
         before any other perl flavors;
      3. There is a path-style conflict (see below), so you must replace
         c:\cygwin64\bin\sh.exe with the wrapper. Run MTR with
         --cygwin-subshell-fix=do for that. Make sure you are running Cygwin
         perl for the option to work.
      4. Restart buildbot via net stop buildbot; net start buildbot
      
      Path-style conflict of Cygwin-ish Perl
      
      Some exe paths are passed to mysqltest, which executes them by a native
      call. This requires native-style paths (\-style). These exe paths are
      also executed by Perl itself: by MTR, which is not so critical, but
      also by the tests' --perl blocks, which is impossible to change. If
      Perl detects shell expansion or uses a pipe command, it passes the exe
      path to /bin/sh, which is a Cygwin-compiled bash that cannot work with
      \-style paths (at least not in -c processing). Thus we require \-style
      in some parts of MTR execution and /-style in other parts.
      
      The examples of tests which cover these different parts are:
      
          main.mysqlbinlog_row_compressed \
          main.sp_trans_log
      
      It would be great to force Perl to use something other than
      /bin/sh, but unfortunately /bin/sh is compiled into the binary. So the
      only solution left is to overwrite /bin/sh with a wrapper script
      that passes the command to cmd.exe instead of bash.
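
To make the three path styles discussed above concrete, here is a small Python sketch. It is not MTR's mixed_path(): the function names are hypothetical, and it only handles the default /cygdrive prefix, whereas real Cygwin path conversion also consults the mount table.

```python
import re


def to_mixed_path(path, cygdrive_prefix="/cygdrive"):
    """Turn /cygdrive/c/dir into c:/dir (drive letter plus forward slashes).

    Hypothetical helper; paths that do not match are only slash-normalized.
    """
    match = re.match(re.escape(cygdrive_prefix) + r"/([A-Za-z])(/.*)?$", path)
    if match:
        drive, rest = match.group(1), match.group(2) or "/"
        return f"{drive}:{rest}"
    return path.replace("\\", "/")


def to_native_path(path):
    """Turn a Unix-style or mixed path into a \\-style native path."""
    return to_mixed_path(path).replace("/", "\\")


unix_style = "/cygdrive/c/server/client/mysqltest.exe"
print(to_mixed_path(unix_style))
print(to_native_path(unix_style))
```

The mixed form is usually accepted by both native programs and Cygwin tools, which is why it is a convenient middle ground; the \-style form is the one that a Cygwin-compiled shell mishandles.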
    • MDEV-30836 MTR Cygwin subshell wrapper fix · 4ed58303
      Aleksey Midenkov authored
      See "Path-style conflict" in "MDEV-30836 MTR Cygwin fix" for explanation.
      
      To install subshell fix use --cygwin-subshell-fix=do
      To uninstall use --cygwin-subshell-fix=remove
      
      This works only from a Cygwin environment. As long as the perl on PATH
      is from Cygwin, you are in a Cygwin environment. Check it with
      
           perl --version
      
           This is perl 5, version 36, subversion 1 (v5.36.1) built for
           x86_64-cygwin-threads-multi
    • MDEV-30836 run_test_server() refactored · 0815a3b6
      Aleksey Midenkov authored
      run_test_server() is actually the manager main loop. We move this
      function into the Manager package and split it into run() and
      parse_protocol(). The latter is needed for the fix. Moving it into a
      separate package makes it possible to share some variables that were
      local to run_test_server().
      
      Functions from the main package are now prefixed with main:: (this
      should be reorganized somehow later, or they should be auto-imported).
    • MDEV-30836 MTR misc improvements · 92fb31f0
      Aleksey Midenkov authored
      1. Better logging and error reporting;
      2. Worker process title;
      3. Some comments
      
      Worker process title example:
      
       446209 pts/2    R+     0:00 mysql-test-run.pl worker[01] :42146 -> :35027 versioning.view
       446210 pts/2    S+     0:00 mysql-test-run.pl worker[02] :42150 -> :35027 versioning.view
       446211 pts/2    S+     0:00 mysql-test-run.pl worker[03] :42154 -> :35027 versioning.foreign
       446212 pts/2    S+     0:00 mysql-test-run.pl worker[04] :42160 -> :35027 versioning.autoinc
      
      The manager-worker localhost socket connection is shown as a pair of ports, :source -> :destination.
      
      -vv now adds --verbose to mysqltest as well; see var/mysqltest.log for the output.
  3. 04 Sep, 2023 1 commit
    • MDEV-25177 Better indication of refusing to start because of ProtectHome · 91ab8194
      Daniel Black authored
      Creating the test file for the case-insensitivity check gives only a
      basic warning, and the next thing a user might see is an abort.
      
      ProtectHome and other systemd settings protect system services from
      accessing user data. Unfortunately, some of our users do put things
      on /home due to space or other reasons.
      
      Rather than enumerate the systemd options in a very clunky, fragile
      way, we put an error associated with the "Can't create test file" and
      hope the user can work it out from there.
      
      Thanks to Sergei for the %M tip.
  4. 02 Sep, 2023 5 commits
    • MDEV-14959: Fixed memory leak relating with view and IS · d0a872c2
      Dmitry Shulga authored
      Fixed a memory leak that occurred when executing a prepared statement
      or a stored routine that queries a view built on an information
      schema table.
      
      For example, consider the following definition of the view 'v1':
      CREATE VIEW v1 AS SELECT table_name FROM information_schema.views
      ORDER BY table_name;
      
      Querying this view in PS mode hits an assert:
      PREPARE stmt FROM "SELECT * FROM v1";
      EXECUTE stmt;
      EXECUTE stmt; (*)
      
      Running the statement marked with (*) leads to a crash if the
      server is built with the mode that controls memory allocation on the
      SP/PS memory root on the second and following executions of PS/SP.
      
      The memory leaks because the memory allocated while processing the
      FRM file of the view is requested from the SP/PS memory root. This
      memory is released only when the stored routine is evicted from the
      SP cache or the prepared statement is deallocated, which typically
      happens when the user session terminates.
      
      To fix the issue, switch to a memory root created specifically for
      short-lived objects that are allocated while parsing the FRM.
    • MDEV-14959: Fixed memory leak happened on re-parsing a view that substitutes a table · be023562
      Dmitry Shulga authored
      If a table accessed by a PS/SP is dropped after the first execution of
      the PS/SP and a view is then created with the same name as the dropped
      table, the second execution of the PS/SP allocates memory on the SP/PS
      memory root that was already marked as read only on the first execution.
      
      For example, the following test case:
      CREATE TABLE t1 (a INT);
      PREPARE stmt FROM "INSERT INTO t1 VALUES (1)";
      EXECUTE stmt;
      DROP TABLE t1;
      CREATE VIEW t1 AS SELECT 1;
      --error ER_NON_INSERTABLE_TABLE
      EXECUTE stmt; # (*)
      DROP VIEW t1;
      
      hits an assert when running the statement 'EXECUTE stmt' marked with (*),
      because memory is allocated while parsing the view.
      
      Memory allocation is requested inside the function mysql_make_view()
      when a view definition is being parsed. To avoid the assertion
      failure, the call to mysql_make_view() must be moved after the
      invocation of the function check_and_update_table_version().
      This results in re-preparing the whole PS statement or the current
      SP instruction, which frees the currently allocated items and resets
      the read_only flag of the memory root.
    • MDEV-14959: Fixed possible memory leaks that could happen on running PS/SP depending on a trigger · 1d502a29
      Dmitry Shulga authored
      Moved the call of the function check_and_update_table_version() to just
      before the place where the function extend_table_list() is invoked,
      in order to avoid allocating memory on a PS/SP memory root
      marked as read only. This happens because the function
      extend_table_list() invokes sp_add_used_routine() to add a trigger
      that was created for the table between executions of the statement
      EXECUTE `stmt_id`.
      
      For example, the following test case
      create table t1 (a int);
      
      prepare stmt from "insert into t1 (a) value (1)";
      execute stmt;
      
      create trigger t1_bi before insert on t1 for each row
        set @message= new.a;
      
      execute stmt; # (*)
      
      adds the trigger t1_bi to the list of used routines, which involves
      allocating memory on the PS memory root that was already marked
      as read only on the first run of the statement 'execute stmt'.
      As a result, executing the statement marked with (*) hits an assert.
      
      To fix the issue, call the function check_and_update_table_version()
      before invoking extend_table_list() to force re-compilation of the
      PS/SP, which resets the read-only flag of its memory root.
    • MDEV-14959: Moved calculation of the number of items reserved for EXISTS-to-IN transformation · d8574dbb
      Dmitry Shulga authored
      It is now done before the call to select_lex->setup_ref_array()
      in order to avoid allocating memory on the SP/PS memory root on its
      second invocation.
    • MDEV-14959: Control over memory allocated for SP/PS · 0d4be10a
      Dmitry Shulga authored
      This patch adds support for controlling memory allocations done by
      SP/PS on the second and following executions.
      As soon as an SP or PS has been executed for the first time, its
      memory root is marked as read only, since no further memory should be
      allocated on it. If such an allocation does take place, it hits the
      assert for the invariant that no new memory is allocated once the
      SP/PS memory root has been marked as read only.
      
      The control of memory allocation made on behalf of SP/PS is turned on
      when the build is a debug build and the cmake option
      -DWITH_PROTECT_STATEMENT_MEMROOT is set.
      
      The reason for introducing the new cmake option
        -DWITH_PROTECT_STATEMENT_MEMROOT
      is that the current server implementation has too many places where
      such memory allocation takes place. As soon as all such incorrect
      allocations are fixed, the cmake option
       -DWITH_PROTECT_STATEMENT_MEMROOT
      can be removed and the check can be turned on for debug builds
      alone. Until every incorrect memory allocation is fixed, it makes
      sense to guard the check with an extra cmake option; otherwise we
      would get a lot of failing tests on buildbot.
      
      Moreover, fixing all incorrect memory allocations could take a long
      time, so the new cmake option makes it possible to introduce the
      feature without waiting until all places throughout the source code
      are fixed.
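
The read-only memory root invariant described above can be modelled in a few lines of Python. This is a toy model under stated assumptions, not the server's MEM_ROOT implementation: the class names `MemRoot` and `PreparedStatement` and the `extra_alloc` parameter are hypothetical and exist only to show when the assert fires.

```python
class MemRoot:
    """Toy arena that refuses allocations once marked read only."""

    def __init__(self, name):
        self.name = name
        self.read_only = False
        self.blocks = []

    def alloc(self, size):
        if self.read_only:
            # Stands in for the debug assertion described in the commit.
            raise AssertionError(
                f"allocation on read-only memory root {self.name!r}")
        block = bytearray(size)
        self.blocks.append(block)
        return block


class PreparedStatement:
    """Hypothetical statement whose root is frozen after the first run."""

    def __init__(self):
        self.root = MemRoot("stmt")
        self.executions = 0

    def execute(self, extra_alloc=0):
        if self.executions == 0:
            self.root.alloc(64)            # first execution may allocate
        elif extra_alloc:
            self.root.alloc(extra_alloc)   # later allocation is a bug
        self.executions += 1
        self.root.read_only = True


stmt = PreparedStatement()
stmt.execute()                 # first execution: allocation allowed
stmt.execute()                 # second execution, no allocation: fine
try:
    stmt.execute(extra_alloc=16)
    violation = None
except AssertionError as exc:
    violation = str(exc)
print(violation)
```

In this model, re-preparing a statement would correspond to discarding the blocks and clearing `read_only`, which is the effect the related MDEV-14959 fixes rely on.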
  5. 01 Sep, 2023 1 commit
  6. 28 Aug, 2023 1 commit
    • MDEV-31890: Compilation failing on MacOS (unknown warning option -Wno-unused-but-set-variable) · 1fde7853
      Dmitry Shulga authored
      For the clang compiler, the flag -Wno-unused-but-set-variable
      was set based on the compiler version. This approach could result in
      a false positive when detecting the presence of a compiler option, since
      only the first three groups of digits in the compiler version are taken
      into account, which can make the determination of supported compiler
      features inaccurate.
      
      The correct way to detect options supported by a compiler is to use
      the macro MY_CHECK_CXX_COMPILER_FLAG and to check the result
      variable with the prefix have_CXX__.
      So, to check whether the compiler supports the option
       -Wno-unused-but-set-variable
      the macro
       MY_CHECK_CXX_COMPILER_FLAG(-Wno-unused-but-set-variable)
      should be called and the result variable
       have_CXX__Wno_unused_but_set_variable
      tested for its assigned value.
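
The relationship between the flag and its result variable can be illustrated with a short Python sketch. The naming rule below is inferred from the single example in this commit message (a have_CXX_ prefix, with non-alphanumeric characters of the flag replaced by underscores); the actual CMake macro may use a different character set, and the function name `result_variable` is hypothetical.

```python
import re


def result_variable(flag, lang="CXX"):
    """Derive the result-variable name for a compiler flag (inferred rule)."""
    return f"have_{lang}_" + re.sub(r"[^A-Za-z0-9]", "_", flag)


name = result_variable("-Wno-unused-but-set-variable")
print(name)
```

The double underscore after have_CXX comes from the leading dash of the flag being replaced as well.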
  7. 24 Aug, 2023 1 commit
    • MDEV-31813 SET GLOBAL innodb_max_purge_lag_wait hangs if innodb_read_only · 02878f12
      Marko Mäkelä authored
      innodb_max_purge_lag_wait_update(): Return immediately if we are
      in high_level_read_only mode.
      
      srv_wake_purge_thread_if_not_active(): Relax a debug assertion.
      If srv_read_only_mode holds, purge_sys.enabled() will not hold
      and this function will do nothing.
      
      trx_t::commit_in_memory(): Remove a redundant condition before
      invoking srv_wake_purge_thread_if_not_active().
  8. 23 Aug, 2023 1 commit
    • MDEV-31117 Fix spider connection info parsing · e9f3ca61
      Yuchen Pei authored
      A Spider connection string is a comma-separated list of parameter
      definitions, where each definition is of the form
      "<param_title> <param_value>", and <param_value> is quote delimited on
      both ends, with backslashes acting as an escaping prefix.
      
      Despite the simple syntax, the existing spider connection string
      parser was poorly written, complex, hard to reason about and
      error-prone, causing issues like the one described in MDEV-31117. For
      example, it treated the param title the same way as the param value
      when assigning, and had nonsensical fields like delim_title_len and
      delim_title.
      
      Thus as part of the bugfix, we clean up the spider comment connection
      string parsing, including:
      
      - Factoring out some code from the parsing function
      - Simplifying the struct `st_spider_param_string_parse`
      - Making any necessary changes caused by the above changes
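
The connection-string syntax described above is simple enough to sketch as a small parser. This is a minimal illustration in Python, not the Spider parser: the function name is hypothetical, accepting both single and double quotes is an assumption, a backslash is treated as escaping exactly the next character, and the parameter names in the example are illustrative only.

```python
def parse_connection_string(text):
    """Parse '<title> "<value>", <title> "<value>"' into a dict."""
    params = {}
    i, n = 0, len(text)
    while i < n:
        # Skip separators and whitespace between definitions.
        while i < n and (text[i].isspace() or text[i] == ","):
            i += 1
        if i >= n:
            break
        # The title runs up to the next whitespace.
        start = i
        while i < n and not text[i].isspace():
            i += 1
        title = text[start:i]
        while i < n and text[i].isspace():
            i += 1
        if i >= n or text[i] not in "\"'":
            raise ValueError(f"expected quoted value for {title!r}")
        quote = text[i]
        i += 1
        value = []
        while i < n and text[i] != quote:
            if text[i] == "\\" and i + 1 < n:
                i += 1               # backslash escapes the next character
            value.append(text[i])
            i += 1
        if i >= n:
            raise ValueError(f"unterminated value for {title!r}")
        i += 1                       # consume the closing quote
        params[title] = "".join(value)
    return params


parsed = parse_connection_string('host "example", table "t\\"1", note "a,b"')
print(parsed)
```

Keeping title and value handling separate, as here, avoids the kind of confusion the commit message describes, where the title was treated like a value.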
  9. 22 Aug, 2023 1 commit
    • MDEV-20194 test adjustment for s390x · ff682ead
      Marko Mäkelä authored
      The test innodb.row_size_error_log_warnings_3 that was added in
      commit 372b0e63 (MDEV-20194)
      failed to take into account the earlier adjustment in
      commit cf574cf5 (MDEV-27634)
      that is specific to many GNU/Linux distributions for the s390x.
  10. 21 Aug, 2023 1 commit
  11. 17 Aug, 2023 3 commits
    • MDEV-31928 Assertion xid ... < 128 failed in trx_undo_write_xid() · 5a8a8fc9
      Marko Mäkelä authored
      trx_undo_write_xid(): Correct an off-by-one error in a debug assertion.
    • MDEV-31254 InnoDB: Trying to read doublewrite buffer page · 518fe519
      Marko Mäkelä authored
      buf_read_page_low(): Remove an error message that could be triggered
      by buf_read_ahead_linear() or buf_read_ahead_random().
      
      This is a backport of commit c9eff1a1
      from MariaDB Server 10.5.
    • MDEV-31875 ROW_FORMAT=COMPRESSED table: InnoDB: ... Only 0 bytes read · 44df6f35
      Marko Mäkelä authored
      buf_read_ahead_random(), buf_read_ahead_linear(): Avoid read-ahead
      of the last page(s) of ROW_FORMAT=COMPRESSED tablespaces that use
      a page size of 1024 or 2048 bytes. We invoke os_file_set_size() on
      integer multiples of 4096 bytes in order to be compatible with
      the requirements of innodb_flush_method=O_DIRECT regardless of the
      physical block size of the underlying storage.
      
      This change must be null-merged to MariaDB Server 10.5 and later.
      There, out-of-bounds read-ahead should be handled gracefully
      by simply discarding the buffer page that had been allocated.
      
      Tested by: Matthias Leich
  12. 16 Aug, 2023 1 commit
    • MDEV-29974: Missed kill waiting for worker queues to drain · 34e85854
      Kristian Nielsen authored
      When the SQL driver thread goes to wait for room in the parallel slave
      worker queue, there was a race where a kill at the right moment could
      be ignored and the wait would proceed uninterrupted by the kill.
      
      Fix by moving the THD::check_killed() to occur _after_ doing ENTER_COND().
      
      This bug was seen as sporadic failure of the testcase rpl.rpl_parallel
      (rpl.rpl_parallel_gco_wait_kill since 10.5), with "Slave stopped with
      wrong error code".
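
The ordering issue behind this fix can be illustrated with a generic condition-variable sketch in Python. It is an analogy under stated assumptions, not the server code: the `Waiter` class is hypothetical, entering the `with self.cond` block stands in for ENTER_COND(), and the `killed` test stands in for THD::check_killed().

```python
import threading


class Waiter:
    """Hypothetical waiter that checks the kill flag after taking the lock."""

    def __init__(self):
        self.cond = threading.Condition()
        self.killed = False
        self.queue_full = True

    def kill(self):
        # The killer sets the flag and wakes waiters under the same lock.
        with self.cond:
            self.killed = True
            self.cond.notify_all()

    def wait_for_room(self):
        with self.cond:                  # analogous to ENTER_COND()
            while self.queue_full:
                if self.killed:          # checked only after entering
                    return "killed"
                self.cond.wait()
            return "room"


waiter = Waiter()
result = []
thread = threading.Thread(
    target=lambda: result.append(waiter.wait_for_room()))
thread.start()
waiter.kill()
thread.join(timeout=5)
print(result)
```

If the flag were tested before entering the wait context, a kill arriving between the test and the wait would have nobody to wake, and the waiter would sleep until something else signalled it.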
      Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
  13. 15 Aug, 2023 6 commits
    • MDEV-31655: Parallel replication deadlock victim preference code erroneously removed · 900c4d69
      Kristian Nielsen authored
      Restore code to make InnoDB choose the second transaction as a deadlock
      victim if two transactions deadlock that need to commit in-order for
      parallel replication. This code was erroneously removed when VATS was
      implemented in InnoDB.
      
      Also add a test case for InnoDB choosing the right deadlock victim.
      Also fixes this bug, with testcase that reliably reproduces:
      
      MDEV-28776: rpl.rpl_mark_optimize_tbl_ddl fails with timeout on sync_with_master
      
      Note: This should be null-merged to 10.6, as a different fix is needed
      there due to InnoDB locking code changes.
      Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
    • MDEV-31482: Lock wait timeout with INSERT-SELECT, autoinc, and statement-based replication · 920789e9
      Kristian Nielsen authored
      Remove the exception that InnoDB does not report auto-increment lock waits
      to parallel replication.
      
      There was an assumption that these waits could not cause conflicts with
      in-order parallel replication and thus need not be reported. However, this
      assumption is wrong and it is possible to get conflicts that lead to hangs
      for the duration of --innodb-lock-wait-timeout. This can be seen with three
      transactions:
      
      1. T1 is waiting for T3 on an autoinc lock
      2. T2 is waiting for T1 to commit
      3. T3 is waiting on a normal row lock held by T2
      
      Here, T3 needs to be deadlock killed on the wait by T1.
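
The three waits listed above form a cycle, which a small wait-for graph makes visible. The Python sketch below is only an illustration: the function name is hypothetical and it assumes each transaction waits for at most one other. It also shows why the cycle goes unnoticed if the auto-increment wait is not reported.

```python
def find_wait_cycle(waits_for, start):
    """Follow wait edges from start; return the cycle as a list, or None."""
    path = [start]
    node = waits_for.get(start)
    while node is not None and node not in path:
        path.append(node)
        node = waits_for.get(node)
    if node is None:
        return None
    return path[path.index(node):] + [node]


# All three waits reported: T1 -> T3 (autoinc), T2 -> T1 (commit order),
# T3 -> T2 (row lock).
reported = {"T1": "T3", "T2": "T1", "T3": "T2"}
cycle = find_wait_cycle(reported, "T1")
print(cycle)

# Without the auto-increment wait, no cycle is visible from T1.
unreported = {"T2": "T1", "T3": "T2"}
no_cycle = find_wait_cycle(unreported, "T1")
print(no_cycle)
```

Note that the T2 -> T1 edge is a commit-order wait from in-order parallel replication rather than an InnoDB lock wait, so the cycle can only be seen when both kinds of waits are combined.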
      
      Note: This should be null-merged to 10.6, as a different fix is needed
      there due to InnoDB lock code changes.
      Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
    • Remove the often-hanging test innodb.alter_rename_files · b4ace139
      Marko Mäkelä authored
      The test innodb.alter_rename_files rather frequently hangs in
      checkpoint_set_now. The test was removed in MariaDB Server 10.5
      commit 37e7bde1 when the code that
      it aimed to cover was simplified. Starting with MariaDB Server 10.5
      the page flushing and log checkpointing is much simpler, handled
      by the single buf_flush_page_cleaner() thread.
      
      Let us remove the test to avoid occasional failures. We are not going
      to fix the cause of the failure in MariaDB Server 10.4.
    • Merge mariadb-10.4.31 into 10.4 · 6fdc6846
      Marko Mäkelä authored
    • MDEV-24797 Column Compression - ERROR 1265 (01000): Data truncated for column · 9c8ae6dc
      Alexander Barkov authored
      The issue was fixed earlier by MDEV-31724. This commit only adds MTR tests.
    • MDEV-31724 Compressed varchar values lost on joins when sorting on columns from joined table(s) · 1fa7c9a3
      Alexander Barkov authored
      Field_varstring::get_copy_func() did not take into account
      that functions do_varstring1[_mb], do_varstring2[_mb] do not support
      compressed data.
      
      Changing the return value of Field_varstring::get_copy_func()
      to `do_field_string` if there is compression and truncation
      at the same time. This fixes the problem, so now it works as follows:
      - val_str() uncompresses the data
      - The prefix is then calculated on the uncompressed data
      
      Additionally, introducing two new copying functions
      - do_varstring1_no_truncation()
      - do_varstring2_no_truncation()
      
      Using new copying functions in cases when:
      - a Field_varstring with length_bytes==1 is changing to a longer
          Field_varstring with length_bytes==1
      - a Field_varstring with length_bytes==2 is changing to a longer
          Field_varstring with length_bytes==2
      
      In these cases we care neither about compression nor
      about multi-byte prefixes: the entire data gets fully copied
      from the source column to the target column as is.
      
      This is a kind of new optimization, but this also was needed
      to preserve existing MTR test results.
  14. 14 Aug, 2023 1 commit
  15. 11 Aug, 2023 1 commit
  16. 10 Aug, 2023 4 commits
    • MDEV-31893 Valgrind reports issues in main.join_cache_notasan · 2aea9387
      Monty authored
      This is also related to
      MDEV-31348 Assertion `last_key_entry >= end_pos' failed in virtual bool
                 JOIN_CACHE_HASHED::put_record()
      
      Valgrind exposed a problem with the join_cache for hash joins:
      ==25636== Conditional jump or move depends on uninitialised value(s)
      ==25636== at 0xA8FF4E: JOIN_CACHE_HASHED::init_hash_table()
                (sql_join_cache.cc:2901)
      
      The reason for this was that avg_record_length contained a random value
      if one had used SET optimizer_switch='optimize_join_buffer_size=off'.
      
      This causes either 'random size' memory to be allocated (up to
      join_buffer_size) which can increase memory usage or, if avg_record_length
      is less than the row size, memory overwrites in thd->mem_root, which is
      bad.
      
      Fixed by setting avg_record_length in JOIN_CACHE_HASHED::init()
      before it's used.
      
      There is no test case for MDEV-31893 as valgrind of join_cache_notasan
      checks that.
      I added a test case for MDEV-31348.
    • MDEV-23021: rpl.rpl_parallel_optimistic_until fails in Buildbot · b2e312b0
      Kristian Nielsen authored
      The test case accessed slave-relay-bin.000003 without waiting for the IO
      thread to write it first. If the IO thread was slow, this could fail.
      Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
    • MDEV-381: fdatasync() does not correctly flush growing binlog file · 5055490c
      Kristian Nielsen authored
      Revert the old work-around for buggy fdatasync() on Linux ext3. This bug was
      fixed in Linux more than 10 years ago, at least as far back as kernel version 3.0.
      Reviewed-by: Marko Mäkelä <marko.makela@mariadb.com>
      Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
    • MDEV-31893 Valgrind reports issues in main.join_cache_notasan · e9333ff0
      Monty authored
      This is also related to
      MDEV-31348 Assertion `last_key_entry >= end_pos' failed in virtual bool
                 JOIN_CACHE_HASHED::put_record()
      
      Valgrind exposed a problem with the join_cache for hash joins:
      ==25636== Conditional jump or move depends on uninitialised value(s)
      ==25636== at 0xA8FF4E: JOIN_CACHE_HASHED::init_hash_table()
                (sql_join_cache.cc:2901)
      
      The reason for this was that avg_record_length contained a random value
      if one had used SET optimizer_switch='optimize_join_buffer_size=off'.
      
      This causes either 'random size' memory to be allocated (up to
      join_buffer_size) which can increase memory usage or, if avg_record_length
      is less than the row size, memory overwrites in thd->mem_root, which is
      bad.
      
      Fixed by setting avg_record_length in JOIN_CACHE_HASHED::init()
      before it's used.
      
      There is no test case for MDEV-31893 as valgrind of join_cache_notasan
      checks that.
      I added a test case for MDEV-31348.
  17. 08 Aug, 2023 4 commits