1. 06 May, 2024 1 commit
    • MDEV-30929 spider.spider_fixes_part: wait and restart slave · 64314d30
      Yuchen Pei authored
      In the absence of insight into the cause of the
      spider.spider_fixes_part failure described in MDEV-30929, this is a
      workaround that could help determine whether the slave SQL thread
      attempts to read from a file that may not yet be on disk. It does not
      otherwise affect the coverage of the test.
      
      I have pushed this commit 4 times, but have yet to encounter the
      failure as described in MDEV-30929, so it could also fix the test and
      stop the CI pollution.
      
      Also replaced START SLAVE; with --source include/start_slave.inc
      inside the slave_test_init.inc files.
  2. 02 May, 2024 2 commits
    • sporadic failures of binlog_encryption.rpl_parallel_gco_wait_kill · 3ee6f69d
      Sergei Golubchik authored
      CURRENT_TEST: binlog_encryption.rpl_parallel_gco_wait_kill
      mysqltest: In included file "./suite/rpl/t/rpl_parallel_gco_wait_kill.test":
      included from /home/buildbot/amd64-ubuntu-2004-debug/build/mysql-test/suite/binlog_encryption/rpl_parallel_gco_wait_kill.test at line 2:
      At line 334: Can't initialize replace from 'replace_result $thd_id THD_ID'
      
      An sql thread can reach the "Slave has read all relay log" state
      and then start reading relay log again. Let's use a more generic
      pattern to retrieve the sql thread ID even if it's not
      in the "read all relay log" state.
    • fix sporadic failures of main.lock_sync · 9dfef3fb
      Sergei Golubchik authored
      wait for all connections to disconnect before the cleanup
  3. 30 Apr, 2024 7 commits
  4. 29 Apr, 2024 3 commits
    • Merge branch '10.5' into 10.6 · c1f3eff5
      Sergei Golubchik authored
    • MDEV-30727 Check spider_hton_ptr in spider udfs · 267dd5a9
      Yuchen Pei authored
      We have to #undef my_error and find it from udfs when spider is not
      installed.
    • MDEV-33669 mariabackup --backup hangs · 52f6df99
      mariadb-DebarunBanerjee authored
      This is a server hang and not an issue with backup. While concurrent
      DDLs in the server are hung, mariabackup waits for the DDLs to
      finish while trying to acquire MDL_BACKUP_BLOCK_DDL.
      
      The server hang is serious and is caused by the thread pool state
      being incorrectly set to "thread creation pending" while no creation
      is actually pending. Once a thread pool reaches this state, no new
      thread gets created in the pool.
      
      While this could possibly affect all thread pools in the server, the
      InnoDB thread pool is the victim in the current bug: an IO job gets
      blocked when the pool is stuck with far fewer threads than intended.
      The available workers are blocked in purge, waiting for a page lock to
      be released by the IO write (SX lock), which causes a complete deadlock.
      
      The issue is caused by the state variable m_thread_creation_pending
      introduced by MDEV-31095: 9e62ab7a. We check and set the variable
      early while attempting to create a new thread in the pool, but fail to
      reset it if we exit the flow for other reasons, such as reaching the
      maximum number of threads or entering the thread creation throttling
      path.
      
      Fix: The simple fix is to make sure that the state is reset back in case
      we don't actually attempt to create the thread.
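
The failure mode described above can be illustrated with a small, single-threaded model. This is only a sketch under stated assumptions, not the MariaDB thread pool code: `PoolModel`, `add_thread()` and `reset_on_early_exit` are hypothetical names, and `creation_pending` merely stands in for `m_thread_creation_pending`.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, single-threaded model of the bug; not the real thread pool.
struct PoolModel {
  std::size_t threads = 0;
  std::size_t max_threads = 2;
  bool creation_pending = false;    // stands in for m_thread_creation_pending
  bool reset_on_early_exit = true;  // set to false to reproduce the stuck state

  // Returns true if a thread creation was actually started.
  bool add_thread() {
    if (creation_pending)
      return false;                 // a creation is believed to be in flight
    creation_pending = true;        // checked and set early, as described above
    if (threads >= max_threads) {
      if (reset_on_early_exit)
        creation_pending = false;   // the fix: nothing was started, undo the flag
      return false;
    }
    ++threads;                      // models "thread created and running"
    creation_pending = false;       // cleared once the new thread is running
    return true;
  }
};
```

With `reset_on_early_exit` set to false, one rejected call leaves the flag set, and every later call returns early even after capacity becomes available again. With the reset in place, the pool recovers as soon as there is room for another thread.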
  5. 28 Apr, 2024 2 commits
  6. 27 Apr, 2024 1 commit
    • MDEV-33534 UBSAN: Negation of -X cannot be represented in type 'long long... · 3141a68b
      Alexander Barkov authored
      MDEV-33534 UBSAN: Negation of -X cannot be represented in type 'long long int'; cast to an unsigned type to negate this value to itself in my_double_round from sql/item_func.cc
      
      The negation in this line:
        ulonglong abs_dec= dec_negative ? -dec : dec;
      did not take into account that 'dec' can be the smallest possible
      signed negative value -9223372036854775808. Negating that value is
      undefined behavior.
      
      Fixing the code to use Longlong_hybrid, which implements a safe
      method to get an absolute value.
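
For illustration, a safe absolute value can be computed by converting to an unsigned type before negating, as the UBSAN message itself suggests. The helper below is a minimal sketch and not the Longlong_hybrid API; `safe_abs` is a hypothetical name, and the example assumes a 64-bit `long long`.

```cpp
#include <cassert>
#include <climits>

// Illustrative helper (not the Longlong_hybrid API): absolute value of a
// signed integer returned as unsigned, without signed overflow.
inline unsigned long long safe_abs(long long v) {
  // Convert to unsigned first. Unsigned subtraction wraps modulo 2^64 and is
  // well defined, so LLONG_MIN maps to 9223372036854775808 with 64-bit types.
  return v < 0 ? 0ULL - static_cast<unsigned long long>(v)
               : static_cast<unsigned long long>(v);
}
```

The plain expression `-dec` is evaluated in the signed type before any conversion to `ulonglong` happens, which is why the original line overflowed for the minimum value.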
  7. 26 Apr, 2024 5 commits
    • sporadic failures of rpl.rpl_parallel_multi_domain_xa · 7ff64931
      Sergei Golubchik authored
      It's a slow test: the slave needs to catch up, reading >1500
      transactions. The default MASTER_GTID_WAIT() timeout in
      sync_with_master_gtid.inc is 120 seconds, which might not be
      enough for a slow/overloaded slave.
      
      Let's wait forever or until ./mtr --testcase-timeout,
      whichever comes first.
    • MDEV-33574 Improve mysqlbinlog error message · 3d417476
      Hugo Wen authored
      Previously, when running mysqlbinlog without providing a binlog file, it
      would print the entire help text, which was very verbose and made it
      difficult to identify the actual issue.
      
      Now change the behavior to print a more concise error message instead:
      
          "ERROR: Please provide the log file(s). Run with '--help' for usage instructions."
      
      This makes the error output more user-friendly and easier to understand,
      especially when running the tool in scripts or automated processes.
      
      All new code of the whole pull request, including one or several files
      that are either new files or modified ones, are contributed under the
      BSD-new license. I am contributing on behalf of my employer
      Amazon Web Services, Inc.
    • Fixup 0ccdf54b · ef7a2344
      Daniele Sciascia authored
      0ccdf54b removed the stack-allocated THD objects from the function
      Wsrep_schema::replay_transaction(). However, it inadvertently
      moved the destruction of the THD earlier, causing assertions and use
      of the THD after it was destroyed.
      The fix consists of extracting the original function body into a
      separate function and leaving the allocation and destruction of the
      THD object in Wsrep_schema::replay_transaction(), making sure that
      using the heap-allocated THD has no side effects.
      Same for Wsrep_schema::recover_sr_transactions().
      Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
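
The shape of the fix described in this commit can be sketched as follows. This is a hedged illustration only: `Session`, `replay_impl()` and `replay()` are hypothetical stand-ins for THD and the Wsrep_schema functions, and the real code is considerably more involved.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for THD that counts live instances.
struct Session {
  static int live;
  Session() { ++live; }
  ~Session() { --live; }
};
int Session::live = 0;

// Extracted body: borrows the session and never owns or destroys it.
static int replay_impl(Session &session) {
  (void)session;
  return Session::live;  // the session is guaranteed to be alive here
}

// Outer function: the only place that allocates and destroys the session.
int replay() {
  // Heap allocation keeps the large object out of this stack frame.
  auto session = std::make_unique<Session>();
  int rc = replay_impl(*session);
  return rc;  // session is destroyed here, after every use
}
```

Keeping ownership in the outer function means the object's lifetime spans the whole extracted body, so no code path in the body can run after the destructor.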
    • MDEV-33492 fix installation of rpm/deb packages · 22a69c78
      Sergei Golubchik authored
      followup for 02715174
    • Merge branch '10.6' into 10.11 · c9b1ebee
      Oleksandr Byelkin authored
  8. 25 Apr, 2024 8 commits
    • MDEV-33896 : Galera test failure on galera_3nodes.MDEV-29171 · b3e531a3
      Jan Lindström authored
      Based on the logs, we might start SST before the donor has reached
      the Primary state. Because this test shuts down all nodes, we
      need to make sure when we start nodes that the previous nodes
      have reached the Primary state and joined the cluster.
      Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
    • MDEV-26450 fixup: Remove a bogus assertion · 10d251e0
      Marko Mäkelä authored
      mtr_t::commit_shrink(): Do not assert that some previously clean pages
      will be flagged as modified by this mini-transaction. It could be the
      case that there had been no recent write-back of any of the undo
      tablespace pages that we are modifying when truncating the tablespace.
      It suffices to assert that some pages were modified again:
      ut_ad(m_modifications).
      
      This fixes up commit f5fddae3
    • sporadic failures of rpl.rpl_parallel_sbm · 9e925820
      Sergei Golubchik authored
      The test waits for the event to get stuck on MASTER_DELAY,
      but on a slow/overloaded slave the event might pass MASTER_DELAY
      before the test starts waiting.
      
      Wait for the event to get stuck on the LOCK TABLES (after MASTER_DELAY);
      the event cannot avoid that.
    • MDEV-33993 Possible server hang on DROP INDEX or RENAME INDEX · 0936c138
      Marko Mäkelä authored
      commit_try_norebuild(): Add the parameter statistics_exist,
      similar to commit_try_rebuild(). If the InnoDB statistics tables
      did not exist, we will not attempt to update statistics later on
      during the transaction.
      
      Thanks to Matthias Leich for originally reproducing this scenario.
    • MDEV-33602: Sporadic test failure in rpl.rpl_gtid_stop_start · 553a4d62
      Kristian Nielsen authored
      The test could fail with a duplicate key error because switching to non-GTID
      mode could start at the wrong old-style position. The position could be
      wrong when the previous GTID connect was stopped before receiving the fake
      GTID list event which gives the old-style position corresponding to the GTID
      connected position.
      
      Work-around by injecting an extra event and syncing the slave before
      switching to non-GTID mode.
      Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
    • MDEV-33974 Enable GNU libstdc++ debugging · a1c1f502
      Marko Mäkelä authored
      Starting with GCC 10, let us enable _GLIBCXX_DEBUG as well as
      _GLIBCXX_ASSERTIONS which have an impact on the GNU libstdc++.
      On GCC 8, we observed a compilation failure related to some
      missing type conversion.
      
      Even though clang on GNU/Linux would default to using libstdc++
      and enabling the debugging seems to work with clang-18, we will
      not enable this on clang, in case it would lead to compilation
      errors.
      
      For the clang libc++ before clang-15 there was _LIBCPP_DEBUG,
      but according to
      llvm/llvm-project@f3966eaf869b7bdd9113ab9d5b78469eb0f5f028 and
      llvm/llvm-project@13ea1343231fa4ae12fe9fba4c789728465783d7 and
      llvm/llvm-project@ff573a42cd1f1d05508f165dc3e645a0ec17edb5, it
      looks like a specially built debug version of libc++ would have to
      be used to get proper results and enable equivalent checks.
      
      This should help catch bugs like the one that
      commit 455a15fd fixed.
      
      Reviewed by: Sergei Golubchik
    • MDEV-33979 Disallow bulk insert operation during partition update statement · 8c8b7da0
      Thirunarayanan Balathandayuthapani authored
      Problem:
      ========
      - A partition update operation enables bulk insert for the
      transaction while moving a row between partitions. This leads
      to a debug assertion failure while removing the row from one
      of the partitions.
      
      Solution:
      ========
      - Disallow the bulk insert operation for non-insert operations
      on a partitioned table.
    • MDEV-23974 fixup: Cover all debug builds · 72293842
      Marko Mäkelä authored
      While commit 75b7cd68 was a significant
      improvement, we occasionally got test failures of debug builds. One of
      the affected tests is innodb.innodb-64k-crash.
  9. 24 Apr, 2024 7 commits
  10. 23 Apr, 2024 4 commits
    • MDEV-29955: Set path for zlib library with pkg-config · 55cb2c29
      Meng-Hsiu Chiang authored
      The `FindZLIB` module uses the variable `ZLIB_ROOT`[1] to look for
      libraries. With the variable set, `FindZLIB` is able to find libraries
      that are installed in a non-system path (/workspace/mylib for example).
      
      When `z` is used in `LINK_LIBRARIES()`, CMake looks up the library in
      the system path by default. That doesn't work if the library isn't
      installed there; using ${ZLIB_LIBRARY}, which is set by FindZLIB,
      solves the issue.
      
      All new code of the whole pull request, including one or several files
      that are either new files or modified ones, are contributed under the
      BSD-new license. I am contributing on behalf of my employer Amazon Web
      Services.
      
      [1]: https://cmake.org/cmake/help/latest/module/FindZLIB.html#hints
    • Check and remove high stack usage · 0ccdf54b
      Monty authored
      I checked all stack overflow potential problems found with
      gcc -Wstack-usage=16384
      and
      clang -Wframe-larger-than=16384 -no-inline
      
      Fixes:
      - Added '#pragma clang diagnostic ignored "-Wframe-larger-than="'
        to a lot of functions where stack usage is large but reasonable.
      - Added stack check warnings to BUILD scripts when using clang and debug.
      
      Functions changed to use malloc instead of allocating things on the stack:
      - read_bootstrap_query() now allocates line_buffer (20000 bytes) with
        malloc() instead of using the stack. This has a small performance
        impact, but this is not relevant for bootstrap.
      - mroonga grn_select() used 65856 bytes on stack. Changed it to use
        malloc().
      - Wsrep_schema::replay_transaction() and
        Wsrep_schema::recover_sr_transactions().
      - Connect zipOpen3()
      
      Not fixed:
      - mroonga/vendor/groonga/lib/expr.c grn_proc_call() uses
        43712 bytes on stack. However, this is not easy to fix as the stack
        used is caused by a lot of code generated by defines.
      - Most changes in mroonga/groonga were only the addition of pragmas to
        disable stack warnings.
      - rocksdb/options/options_helper.cc uses 20288 bytes of stack space.
        (no reason to fix except to get rid of the compiler warning)
      - Cases using alloca() where the allocation size is reasonable.
      - An issue in libmariadb (reported to connectors).
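
The stack-to-heap change listed above for read_bootstrap_query() follows a common pattern, sketched below. This is an illustration under assumptions, not the actual server code: `copy_query` and `kLineBufferSize` are hypothetical, and only the 20000-byte figure comes from the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// The size mirrors the line_buffer figure quoted above; the rest is invented.
constexpr std::size_t kLineBufferSize = 20000;

// A local `char line_buffer[kLineBufferSize];` would add about 20 KB to this
// frame and exceed a 16384-byte stack usage limit. Here the buffer is taken
// from the heap instead. Returns the number of bytes copied into dst
// (excluding the terminator), or -1 if the allocation fails.
long copy_query(const char *src, char *dst, std::size_t dst_size) {
  if (dst_size == 0)
    return 0;
  char *line_buffer = static_cast<char *>(std::malloc(kLineBufferSize));
  if (line_buffer == nullptr)
    return -1;
  std::size_t n = std::strlen(src);
  if (n >= kLineBufferSize)
    n = kLineBufferSize - 1;        // truncate to the scratch buffer
  if (n >= dst_size)
    n = dst_size - 1;               // truncate to the caller's buffer
  std::memcpy(line_buffer, src, n);
  line_buffer[n] = '\0';
  std::memcpy(dst, line_buffer, n + 1);
  std::free(line_buffer);           // every successful malloc is paired with free
  return static_cast<long>(n);
}
```

The trade-off is the one the commit message names: an allocation per call costs a little time, which is acceptable on paths that are not performance critical.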
    • MDEV-33970 Assertion `!m.first->second.is_bulk_insert()' failed in trx_undo_report_row_operation() · c3460e69
      Thirunarayanan Balathandayuthapani authored
      In case of a partition insert, InnoDB fails to end the bulk insert
      for one of the partitions. This leads to a bulk insert operation for
      the subsequent delete statement.
      
      trx_t::bulk_insert_apply_for_table(): Irrespective of the bulk insert
      value, InnoDB should end the bulk insert for the table.
    • Marko Mäkelä's avatar
      07faba08