1. 11 Jul, 2024 1 commit
    • Brandon Nesterenko's avatar
      MDEV-34274: Test rpl.rpl_change_master_demote frequently fails on buildbot... · fa804497
      Brandon Nesterenko authored
      MDEV-34274: Test rpl.rpl_change_master_demote frequently fails on buildbot with "IO thread should not be running..."
      
      Note this is a backport of 8c8b3ab7
      from 11.1.
      
      The test rpl.rpl_change_master_demote used a `sleep 1` command
      to give time for a START SLAVE UNTIL to start the slave threads
      and wait for them to automatically die by UNTIL.  On machines
      with heavy load (especially MSAN bb builders), one second was
      not enough, and the test would fail due to the IO thread
      still being up.
      
      This patch fixes the test by replacing the sleep with specific
      conditions to wait for. The test cannot wait for the IO or SQL
      threads to start, as it would be possible that they would be
      started and stopped by the time the MTR executor would check
      the slave status. So instead, we test for proof that they
      existed via the Connections status variable being incremented
      by at least 2 (Connections just shows the global thread id).
      At this point, we still can't use the wait_for_slave_to_stop
      helper, as the SQL/IO_Running fields of SHOW SLAVE STATUS
      may not be updated yet. So instead, we use
      information_schema.processlist, which would show the presence
      of the Slave_SQL/IO threads. So to "wait for the slave to stop",
      we wait for the Slave_SQL/IO threads to be gone from the
      processlist.
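The two-step wait described above can be sketched as a generic polling helper. This is only an illustration in Python, not the MTR test code: `wait_until` and `FakeServer` are hypothetical names, and the toy server merely mimics the `Connections` counter and the processlist.

```python
import time


def wait_until(predicate, timeout=30.0, interval=0.05):
    """Poll predicate() until it returns True or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


class FakeServer:
    """Toy stand-in for the server state (hypothetical, not a MariaDB API)."""

    def __init__(self):
        self.connections = 10
        self.processlist = []

    def start_slave_until(self):
        # Both slave threads are created (each one bumps Connections) and
        # may already have exited before any observer gets to look.
        self.connections += 2
        self.processlist = []


server = FakeServer()
before = server.connections
server.start_slave_until()

# Step 1: proof that both threads existed at some point.
threads_existed = wait_until(
    lambda: server.connections >= before + 2, timeout=1.0
)
# Step 2: "slave stopped" means no slave thread remains in the processlist.
threads_gone = wait_until(
    lambda: not any(cmd.startswith("Slave_") for cmd in server.processlist),
    timeout=1.0,
)
```

Unlike a fixed `sleep 1`, the helper returns as soon as the condition holds and only fails after an explicit timeout.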
  2. 09 Jul, 2024 1 commit
  3. 08 Jul, 2024 6 commits
    • Alexander Barkov's avatar
      4d71a117
    • Alexander Barkov's avatar
      e56040fe
    • Alexander Barkov's avatar
      MDEV-34305 Redundant truncation errors/warnings with optimizer_trace enabled · d1e5fa89
      Alexander Barkov authored
      my_like_range*() can create longer keys than Field::char_length().
      This caused warnings during print_range().
      
      Fix:
      
      Suppressing warnings in print_range().
    • Anson Chung's avatar
      Refactor GitLab cppcheck and update SAST ignorelists · df35072c
      Anson Chung authored
      Line numbers had to be removed from the ignorelists so that they can
      be diffed against, since the locations of the same findings can differ
      across runs. The CI findings therefore have to be preprocessed so that
      they can be compared to the ignorelist and new findings can be
      reported. However, with line numbers removed, it becomes difficult to
      locate a finding in the code from the output of the CI job.
      
      To lessen this pain, change the cppcheck template to include
      code snippets which make it easier to reference where in the code
      the finding is referring to, even in the absence of line numbers.
      Ignorelisting works as before since locations of the finding may
      change but not the code it is referring to.
      
      Furthermore, due to the innate difficulty in maintaining ignorelists
      across branches and triaging new findings, allow failure so as not to
      have constantly failing pipelines as a result of new findings that
      have not been addressed yet.
      
      Lastly, update SAST ignorelists to match the newly refactored cppcheck
      job and the current state of the codebase.
      
      All new code of the whole pull request, including one or several
      files that are either new files or modified ones, are contributed
      under the BSD-new license. I am contributing on behalf of my
      employer Amazon Web Services, Inc.
    • Anson Chung's avatar
      Perform simple fixes for cppcheck findings · 215fab68
      Anson Chung authored
      Rectify cases of mismatched brackets and address
      possible cases of division by zero by checking if
      the denominator is zero before dividing.
      
      No functional changes were made.
      
      All new code of the whole pull request, including one or several
      files that are either new files or modified ones, are contributed
      under the BSD-new license. I am contributing on behalf of my
      employer Amazon Web Services, Inc.
    • Marko Mäkelä's avatar
      MDEV-34510: UBSAN: overflow on adding an unsigned offset · 72ceae73
      Marko Mäkelä authored
      crc32_avx512(): Explicitly cast ssize_t(size) to make it clear that
      we are indeed applying a negative offset to a pointer.
  4. 07 Jul, 2024 1 commit
    • Monty's avatar
      MDEV-34522 Index for (specific) Aria table is created as corrupted · 33964984
      Monty authored
      The issue was that when repairing an Aria table of row format PAGE whose
      data file was bigger than 4G, the data file length was cut short
      because of wrong parameters to MY_ALIGN().
      
      The effect was that ALTER TABLE, OPTIMIZE TABLE or REPAIR TABLE would fail
      on these tables, possibly corrupting them.
      The MDEV also exposed a bug where error state was not propagated properly
      to the upper level if the number of rows in the table changed.
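The commit message does not say which parameters were wrong, so the following is only a sketch of one common way an alignment macro can cut a length short. It assumes MY_ALIGN() uses the usual round-up-and-mask idiom and that the alignment operand is a 32-bit unsigned value; both are assumptions, and `align_up` is a hypothetical Python model, not the server code.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def align_up(length, block, block_bits):
    """Round length up to a multiple of block (a power of two).

    block_bits models the width of the unsigned type that holds `block`:
    the inverted mask ~(block - 1) is computed at that width and then
    zero-extended to 64 bits before the AND.
    """
    width_mask = MASK32 if block_bits == 32 else MASK64
    inverted = ~(block - 1) & width_mask
    return ((length + block - 1) & MASK64) & inverted


FIVE_GIB_PLUS_ONE = 5 * 2**30 + 1
BLOCK = 8192

# A 32-bit mask clears every bit above 4G, so the file length is cut short.
truncated = align_up(FIVE_GIB_PLUS_ONE, BLOCK, block_bits=32)
# A 64-bit mask keeps the high bits and only rounds up to the block size.
correct = align_up(FIVE_GIB_PLUS_ONE, BLOCK, block_bits=64)
```

In this model the usual remedy is to widen the alignment operand to the file-offset type before the mask is built.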
  5. 06 Jul, 2024 1 commit
    • Brandon Nesterenko's avatar
      MDEV-33465: an option to enable semisync recovery · eb4458e9
      Brandon Nesterenko authored
      The current semi-sync binlog fail-over recovery process uses
      rpl_semi_sync_slave_enabled==TRUE as its condition to truncate a
      primary server’s binlog, as it is anticipating the server to re-join
      a replication topology as a replica. However, for servers configured
      with both rpl_semi_sync_master_enabled=1 and
      rpl_semi_sync_slave_enabled=1, if a primary is just re-started (i.e.
      retaining its role as master), it can truncate its binlog to drop
      transactions which its replica(s) has already received and executed.
      If this happens, when the replica reconnects, its gtid_slave_pos can
      be ahead of the recovered primary’s gtid_binlog_pos, resulting in an
      error state where the replica’s state is ahead of the primary’s.
      
      This patch changes the condition for semi-sync recovery to truncate
      the binlog to instead use the configuration variable
      --init-rpl-role, when set to SLAVE. This allows for both
      rpl_semi_sync_master_enabled and rpl_semi_sync_slave_enabled to be
      set for a primary that is restarted, and no transactions will be
      lost, so long as --init-rpl-role is not set to SLAVE.
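The change of condition can be sketched as two predicates. This is a simplified Python model of the decision described above, not the server logic: `ServerConfig` and both function names are hypothetical, and the `MASTER` default for `init_rpl_role` is an assumption of the sketch.

```python
from dataclasses import dataclass


@dataclass
class ServerConfig:
    rpl_semi_sync_master_enabled: bool
    rpl_semi_sync_slave_enabled: bool
    init_rpl_role: str = "MASTER"  # assumed default for this sketch


def truncates_binlog_before_patch(cfg):
    # Old condition: any server with the semi-sync slave side enabled.
    return cfg.rpl_semi_sync_slave_enabled


def truncates_binlog_after_patch(cfg):
    # New condition: only a server explicitly started in the SLAVE role.
    return cfg.init_rpl_role.upper() == "SLAVE"


# A primary with both semi-sync sides enabled that is simply restarted.
restarted_primary = ServerConfig(True, True)
# A former primary that is expected to re-join the topology as a replica.
rejoining_replica = ServerConfig(True, True, init_rpl_role="SLAVE")
```

With the old predicate the restarted primary would truncate its binlog; with the new one only the server started with `--init-rpl-role=SLAVE` does.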
      
      Reviewed By:
      ============
      Sergei Golubchik <serg@mariadb.com>
  6. 05 Jul, 2024 3 commits
    • Brandon Nesterenko's avatar
      MDEV-25607: Auto-generated DELETE from HEAP table can break replication · cbc1898e
      Brandon Nesterenko authored
      The special logic used by the memory storage engine
      to keep slaves in sync with the master on a restart can
      break replication. In particular, after a restart, the
      master writes DELETE statements in the binlog for
      each MEMORY-based table so the slave can empty its
      data. If the DELETE is not executable, e.g. due to
      invalid triggers, the slave will error and fail, whereas
      the master will never see the problem.
      
      Instead of DELETE statements, use TRUNCATE to
      keep slaves in-sync with the master, thereby bypassing
      triggers.
      
      Reviewed By:
      ===========
      Kristian Nielsen <knielsen@knielsen-hq.org>
      Andrei Elkin <andrei.elkin@mariadb.com>
    • Thirunarayanan Balathandayuthapani's avatar
      MDEV-34519 innodb_log_checkpoint_now crashes when innodb_read_only is enabled · 834c013b
      Thirunarayanan Balathandayuthapani authored
      In read-only mode, InnoDB doesn't allow a checkpoint to happen.
      So InnoDB should issue a warning on an attempt to
      force a checkpoint when innodb_read_only = 1 or
      innodb_force_recovery = 6.
    • Hugo Wen's avatar
      Fix a stack overflow in pinbox allocator · 9e8546e2
      Hugo Wen authored
      MariaDB supports a "wait-free concurrent allocator based on pinning addresses".
      `lf_pinbox_real_free()` sorts the pinned addresses so that binary search
      can be used during "real free", for better performance. `alloca()` was used to
      allocate stack memory and copy the addresses.

      To prevent a stack overflow when allocating the stack memory, the function checks
      if there's enough stack space. However, the available stack size was calculated
      inaccurately, which eventually caused a database crash due to stack overflow.
      
      The crash was seen on MariaDB 10.6.11 but the same code defect exists on all
      MariaDB versions.
      
      A similar issue happened previously and the fix in fc2c1e43 was to add an
      `ALLOCA_SAFETY_MARGIN` of 8192 bytes. However, that safety margin is not
      enough during high connection workloads.
      
      MySQL also had a similar issue and the fix
      https://github.com/mysql/mysql-server/commit/b086fda was to remove the use of
      `alloca` and replace the qsort approach with a linear scan through all pointers (pins)
      owned by each thread.
      
      This commit is mostly the same, as that is the only way to solve this issue:
      1. Frame sizes can differ between architectures.
      2. The number of active (non-null) pinned addresses varies, so the frame
         size for the recursive sorting function `msort_with_tmp` is also hard
         to predict.
      3. Allocating big memory blocks on the stack doesn't seem to be a very good
         practice.
      
      For further details see the mentioned commit in MySQL and the inline comments.
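The difference between the two strategies can be sketched in Python. Both functions below are hypothetical models written for illustration; they do not reproduce the actual `lf_pinbox_real_free()` code or the MySQL commit. The point is only that the linear scan needs no temporary copy of the pins, which in the C code was the `alloca()` buffer.

```python
import bisect


def free_unpinned_sorted(purgatory, pins_per_thread):
    """Older approach in spirit: copy all pins, sort, binary-search each address."""
    pinned = sorted(p for pins in pins_per_thread for p in pins if p is not None)
    freed, kept = [], []
    for addr in purgatory:
        i = bisect.bisect_left(pinned, addr)
        if i < len(pinned) and pinned[i] == addr:
            kept.append(addr)   # still pinned by some thread
        else:
            freed.append(addr)  # safe to release
    return freed, kept


def free_unpinned_linear(purgatory, pins_per_thread):
    """Replacement in spirit: no temporary copy, scan each thread's pins."""
    freed, kept = [], []
    for addr in purgatory:
        is_pinned = any(addr == p for pins in pins_per_thread for p in pins)
        (kept if is_pinned else freed).append(addr)
    return freed, kept


# Addresses waiting to be freed, and the pins held by two threads
# (None marks an unused pin slot).
purgatory = [0x1000, 0x2000, 0x3000, 0x4000]
pins_per_thread = [
    [0x2000, None, None, None],
    [None, 0x4000, None, None],
]

sorted_result = free_unpinned_sorted(purgatory, pins_per_thread)
linear_result = free_unpinned_linear(purgatory, pins_per_thread)
```

Both variants release the same addresses; the linear one trades some CPU time for a stack footprint that does not depend on the number of pins.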
      
      All new code of the whole pull request, including one or several files
      that are either new files or modified ones, are contributed under the
      BSD-new license. I am contributing on behalf of my employer Amazon Web
      Services, Inc.
  7. 04 Jul, 2024 6 commits
    • Sergei Petrunia's avatar
      Stabilize analyze_engine_stats2.test · e40d232a
      Sergei Petrunia authored
    • Sergei Petrunia's avatar
      MDEV-34190: r_engine_stats.pages_read_count is unrealistically low · 513c8270
      Sergei Petrunia authored
      The symptoms were: take a server with no activity and a table that's
      not in the buffer pool. Run a query that reads the whole table and
      observe that r_engine_stats.pages_read_count shows about 2% of the table
      was read. Who reads the rest?
      
      The cause was that page prefetching done inside InnoDB was not counted.
      
      This counts page prefetch requests made in buf_read_ahead_random() and
      buf_read_ahead_linear() and makes them visible in:
      
      - ANALYZE: r_engine_stats.pages_prefetch_read_count
      - Slow Query Log: Pages_prefetched:
      
      This patch intentionally doesn't attempt to count the time to read the
      prefetched pages:
      * there's no obvious place where one can do it
      * prefetch reads may be done in parallel (right?), it is not clear how
        to count the time in this case.
    • Galina Shalygina's avatar
      MDEV-29363: Constant subquery causing a crash in pushdown optimization · 6cb896a6
      Galina Shalygina authored
      The crash is caused by the attempt to refix the constant subquery during
      pushdown from HAVING into WHERE optimization.
      
      Every condition that is going to be pushed into WHERE clause is first
      cleaned up, then refixed. Constant subqueries are not cleaned or refixed
      because they will remain the same after refixing, so this complicated
      procedure can be omitted for them (introduced in MDEV-21184).
      Constant subqueries are marked with the flag IMMUTABLE_FL, which lets them
      skip the cleanup stage. They are also marked as fixed, so refixing is
      also not done for them.
      Because of multiple equality propagation, several references to the same
      constant subquery can exist in the condition that is going to be pushed
      into WHERE. Before this patch, the problem appeared in the following way.
      After the first reference to the constant subquery is processed, the flag
      IMMUTABLE_FL for the constant subquery is disabled.
      So, when the second reference to this constant subquery is processed, the
      flag is already disabled and the subquery goes through the procedure of
      cleaning and refixing. That causes a crash.
      
      To solve this problem, IMMUTABLE_FL should be disabled only after all
      references to the constant subquery are processed, so after the whole
      condition that is going to be pushed is cleaned up and refixed.
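The ordering problem can be shown with a toy model. The Python below is a hypothetical sketch, not the optimizer code: `ConstSubquery.immutable` stands in for IMMUTABLE_FL, and `cleanups` counts how often the forbidden cleanup-and-refix step would run on the shared subquery.

```python
class ConstSubquery:
    def __init__(self):
        self.immutable = True  # stands in for IMMUTABLE_FL
        self.cleanups = 0      # how many times cleanup + refix ran


def cleanup_and_refix(node):
    # Immutable nodes skip the procedure entirely.
    if not node.immutable:
        node.cleanups += 1


def push_condition_buggy(references):
    for node in references:
        cleanup_and_refix(node)
        node.immutable = False  # cleared right after the first reference


def push_condition_fixed(references):
    for node in references:
        cleanup_and_refix(node)
    # Clear the flag only once the whole condition has been processed.
    for node in references:
        node.immutable = False


# The same constant subquery is referenced twice in the pushed condition.
shared = ConstSubquery()
push_condition_buggy([shared, shared])
buggy_cleanups = shared.cleanups

shared = ConstSubquery()
push_condition_fixed([shared, shared])
fixed_cleanups = shared.cleanups
```

In the buggy order the second reference finds the flag already cleared and goes through cleanup; in the fixed order it never does.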
      
      Approved by Igor Babaev <igor@maridb.com>
    • Oleksandr Byelkin's avatar
      Merge branch '10.6' into 10.11 · 034a1759
      Oleksandr Byelkin authored
    • Oleksandr Byelkin's avatar
    • Alexander Barkov's avatar
      MDEV-10865 COLLATE keyword doesn't work in PREPARE query · f6989d17
      Alexander Barkov authored
      Applying the COLLATE clause to a parameter caused an error:
        COLLATION '...' is not valid for CHARACTER SET 'binary'
      
      Fix:
      
      - Changing the collation derivation for a non-prepared Item_param
        to DERIVATION_IGNORABLE.
      
      - Allowing to apply any COLLATE clause to expressions with DERIVATION_IGNORABLE.
        This includes:
          1. A non-prepared Item_param
          2. An explicit NULL
          3. Expressions derived from #1 and #2
      
        For example:
          SELECT ? COLLATE utf8mb_unicode_ci;
          SELECT NULL COLLATE utf8mb_unicode_ci;
          SELECT CONCAT(?) COLLATE utf8mb_unicode_ci;
          SELECT CONCAT(NULL) COLLATE utf8mb_unicode_ci;
      
      - Additional change: preserving the collation of an expression when
        the expression gets assigned to a PS parameter and evaluates to SQL NULL.
        Before this change, the collation of the parameter was erroneously set
        to &my_charset_binary.
      
      - Additional change: removing the multiplication by mbmaxlen from the
        fix_char_length_ulonglong() argument, because the multiplication already
        happens inside fix_char_length_ulonglong().
        This fixes a too large column size created for a COLLATE clause.
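The last item is a double-multiplication bug, which a small arithmetic sketch makes concrete. `fix_char_length` below is a hypothetical Python stand-in that only mimics the "characters times mbmaxlen" step attributed above to fix_char_length_ulonglong(); the numbers are illustrative.

```python
def fix_char_length(char_length, mbmaxlen):
    """Toy stand-in: converts a length in characters to a length in bytes."""
    return char_length * mbmaxlen


MBMAXLEN = 4  # e.g. a character set with up to 4 bytes per character
CHARS = 10    # declared length in characters

# Before: the caller already multiplied, so mbmaxlen was applied twice.
bytes_before = fix_char_length(CHARS * MBMAXLEN, MBMAXLEN)
# After: the caller passes characters and the callee multiplies once.
bytes_after = fix_char_length(CHARS, MBMAXLEN)
```

Under these assumptions the pre-fix size is larger than intended by exactly a factor of mbmaxlen.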
  8. 03 Jul, 2024 6 commits
    • Brandon Nesterenko's avatar
      MDEV-9159: Bring back assert in semisync_master.cc · d58975bb
      Brandon Nesterenko authored
      In 10.0 there was an assert to ensure that there were semi
      sync clients before removing one, but it was removed in 10.1.
      This patch adds the assertion back.
    • mariadb-DebarunBanerjee's avatar
      MDEV-34458 wait_for_read in buf_page_get_low hurts performance · 73ad436e
      mariadb-DebarunBanerjee authored
      The performance regression seen while loading the BP is caused by the
      deadlock fix given in MDEV-33543. The area of impact is wider but is
      more visible when the BP is being loaded initially via DMLs. Specifically,
      the response time can be impacted when a DML performs a pessimistic
      operation on an index (split/merge) and the leaf pages are not found in
      the buffer pool. It is more likely to occur with a small BP size.
      
      The origin of the issue dates back to MDEV-30400 that introduced
      btr_cur_t::search_leaf() replacing btr_cur_search_to_nth_level() for
      leaf page searches. In btr_latch_prev, we use RW_NO_LATCH to get the
      previous page fixed in BP without latching. When the page is not in BP,
      we try to acquire and wait for S latch violating the latching order.
      
      This deadlock was analyzed in MDEV-33543 and fixed by using the already
      present wait logic in buf_page_get_gen() instead of waiting for the latch.
      The wait logic is inferior to the usual S latch wait and is simply a
      repeated sleep of 100 microseconds (the actual sleep time could be longer
      depending on the platform). The bug was seen with the "change-buffering" code
      path and the idea was that this path should be less exercised. That
      judgement was not correct: the path is actually quite frequent and
      does impact performance when pages are not in the BP and are being loaded by
      DML expanding/shrinking large data.
      
      Fix: When trying to get a page with RW_NO_LATCH while attempting an
      "out of order" latch, return from buf_page_get_gen immediately instead
      of waiting, and follow the ordered latching path.
    • Oleksandr Byelkin's avatar
      Merge branch '10.5' into 10.6 · dcd8a648
      Oleksandr Byelkin authored
    • Oleksandr Byelkin's avatar
      Fix compiler errors · a4ef05d0
      Oleksandr Byelkin authored
    • Monty's avatar
      Added Lock_time_ms and Table_catalog columns to metadata_lock_info · c91ec6a5
      Monty authored
      If compiled for debugging, LOCK_DURATION is also filled in.
    • Daniel Black's avatar
      MDEV-34502 InnoDB debug mode build - asserts with Valgrind · 25c6e3e4
      Daniel Black authored
      Valgrind sees the assertions as examining uninitialized values.
      
      As the assertions are tested in other Debug builds we know
      it isn't all invalid.
      
      Account for Valgrind by removing the assertion under
      the WITH_VALGRIND=1 compilation.
  9. 02 Jul, 2024 4 commits
    • Monty's avatar
      MDEV-34494 Add server_uid global variable and add it to error log at startup · 2739b5f5
      Monty authored
      The feedback plugin server_uid variable and the calculate_server_uid()
      function are moved from feedback/utils.cc to sql/mysqld.cc.
      
      server_uid is added as a global variable (shown in 'show variables') and
      is written to the error log on server startup together with server version
      and server commit id.
    • Monty's avatar
      MDEV-34491 Setting log_slow_admin="" at startup should be converted to log_slow_admin=ALL · d8c9c5ea
      Monty authored
      We have an issue if a user has the following in a configuration file:
      log_slow_filter=""                  # Log everything to slow query log
      log_queries_not_using_indexes=ON
      
      This sets log_slow_filter to 'not_using_index', which disables
      slow query logging of most queries.
      In effect, one should never use log_slow_filter="" in config files but
      instead use log_slow_filter=ALL.
      
      Fixed by changing log_slow_filter="" that comes either from a
      configuration file or from the command line, when starting the server,
      to log_slow_filter=ALL.
      A warning will be printed when this happens.
      
      Other things:
      - One can now use =ALL for any 'set' variable to set all options at once.
        (backported from 10.6)
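The interaction described above can be modelled with sets. This Python sketch is a deliberately simplified, hypothetical model: the option list is an illustrative subset, `effective_filter_before_fix`, `normalize_startup_value` and `is_logged` are invented names, and the matching rule ("an empty filter logs everything, otherwise a query needs a matching tag") is an assumption made for the example.

```python
ALL_OPTIONS = {"admin", "filesort", "full_join", "full_scan", "not_using_index"}


def effective_filter_before_fix(filter_value, log_queries_not_using_indexes):
    """Old startup behaviour: "" stays empty until another option adds to it."""
    selected = {opt for opt in filter_value.split(",") if opt}
    if log_queries_not_using_indexes:
        selected.add("not_using_index")
    return selected


def normalize_startup_value(filter_value):
    """New startup behaviour: "" is converted to ALL, with a warning."""
    if filter_value == "":
        return set(ALL_OPTIONS), 'log_slow_filter="" changed to log_slow_filter=ALL'
    if filter_value.upper() == "ALL":
        return set(ALL_OPTIONS), None
    return {opt for opt in filter_value.split(",") if opt}, None


def is_logged(query_tags, selected):
    # Simplified rule: an empty filter logs everything, otherwise the
    # query must carry at least one selected tag.
    return not selected or bool(query_tags & selected)


slow_filesort_query = {"filesort"}

old_filter = effective_filter_before_fix("", log_queries_not_using_indexes=True)
new_filter, warning = normalize_startup_value("")
```

In this model the old startup path silently narrows logging to `not_using_index` queries, while the normalized value keeps the slow filesort query in the log.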
    • Daniel Black's avatar
      MDEV-34437: handle error on getaddrinfo · 243dee74
      Daniel Black authored
      When getaddrinfo returns an error, the contents
      of ai are invalid so we cannot continue based
      on their data structures.
      
      In the previous branch of the if statement, we
      abort there if there is an error so for consistency
      we abort here too.
      
      The test case fixes the port number to UINTMAX32
      for both an enumerated bind-address and the
      default bind-address covering the two calls to
      getaddrinfo.
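The rule "do not touch the result when getaddrinfo fails" can be illustrated with Python's standard `socket` module, where the failure surfaces as `socket.gaierror`. This is only an analogy to the C code path, and `resolve_numeric` is a hypothetical helper; `AI_NUMERICHOST` is used so that the example needs no DNS lookup.

```python
import socket


def resolve_numeric(host, port):
    """Return the getaddrinfo() results, or None if resolution failed."""
    try:
        return socket.getaddrinfo(
            host, port, type=socket.SOCK_STREAM, flags=socket.AI_NUMERICHOST
        )
    except socket.gaierror:
        # On error there is no valid address list to walk, so stop here.
        return None


ok = resolve_numeric("127.0.0.1", 3306)
bad = resolve_numeric("not-a-numeric-address", 3306)
```

The caller checks for failure first and only then iterates over the returned address entries.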
      
      Review thanks Sanja.
    • Lena Startseva's avatar
  10. 01 Jul, 2024 4 commits
  11. 29 Jun, 2024 2 commits
  12. 28 Jun, 2024 3 commits
    • Marko Mäkelä's avatar
      Merge 10.6 into 10.11 · 1d76794a
      Marko Mäkelä authored
    • Marko Mäkelä's avatar
      MDEV-32176 Contention in ha_innobase::info_low() · d1ecf5cc
      Marko Mäkelä authored
      During a Sysbench oltp_point_select workload with 1 table and 400
      concurrent connections, a bottleneck on dict_table_t::lock_mutex was
      observed in ha_innobase::info_low().
      
      dict_table_t::lock_latch: Replaces lock_mutex.
      
      In ha_innobase::info_low() and several other places, we will acquire
      a shared dict_table_t::lock_latch or we may elide the latch if
      hardware memory transactions are available.
      
      innobase_build_v_templ(): Remove the parameter "bool locked", and
      require the caller to hold exclusive dict_table_t::lock_latch
      (instead of holding an exclusive dict_sys.latch).
      
      Tested by: Vladislav Vaintroub
      Reviewed by: Vladislav Vaintroub
    • Lena Startseva's avatar
  13. 27 Jun, 2024 2 commits
    • Marko Mäkelä's avatar
      MDEV-33894: Resurrect innodb_log_write_ahead_size · 4ca355d8
      Marko Mäkelä authored
      As part of commit 685d958e (MDEV-14425)
      the parameter innodb_log_write_ahead_size was removed, because it was
      thought that determining the physical block size would be a sufficient
      replacement.
      
      However, we can only determine the physical block size on Linux or
      Microsoft Windows. On some file systems, the physical block size
      is not relevant. For example, XFS uses a block size of 4096 bytes
      even if the underlying block size may be smaller.
      
      On Linux, we failed to determine the physical block size if
      innodb_log_file_buffered=OFF was not requested or possible.
      This will be fixed.
      
      log_sys.write_size: The value of the reintroduced parameter
      innodb_log_write_ahead_size. To keep it simple, this is read-only
      and a power of two between 512 and 4096 bytes, so that the previous
      alignment guarantees are fulfilled. This will replace the previous
      log_sys.get_block_size().
      
      log_sys.block_size, log_t::get_block_size(): Remove.
      
      log_t::set_block_size(): Ensure that write_size will not be less
      than the physical block size. There is no point to invoke this
      function with 512 or less, because that is the minimum value of
      write_size.
      
      innodb_params_adjust(): Add some disabled code for adjusting
      the minimum value and default value of innodb_log_write_ahead_size
      to reflect the log_sys.write_size.
      
      log_t::set_recovered(): Mark the recovery completed. This is the
      place to adjust some things if we want to allow write_size>4096.
      
      log_t::resize_write_buf(): Refer to write_size.
      
      log_t::resize_start(): Refer to write_size instead of get_block_size().
      
      log_write_buf(): Simplify some arithmetic and remove a goto.
      
      log_t::write_buf(): Refer to write_size. If we are writing less than
      that, do not switch buffers, but keep writing to the same buffer.
      Move some code to improve the locality of reference.
      
      recv_scan_log(): Refer to write_size instead of get_block_size().
      
      os_file_create_func(): For type==OS_LOG_FILE on Linux, always invoke
      os_file_log_maybe_unbuffered(), so that log_sys.set_block_size() will
      be invoked even if we are not attempting to use O_DIRECT.
      
      recv_sys_t::find_checkpoint(): Read the entire log header
      in a single 12 KiB request into log_sys.buf.
      
      Tested with:
      ./mtr --loose-innodb-log-write-ahead-size=4096
      ./mtr --loose-innodb-log-write-ahead-size=2048
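The constraints stated above for the reintroduced parameter (a power of two between 512 and 4096 bytes, never smaller than the physical block size) can be written down as a small check. The Python below is a hypothetical sketch of those two rules, not the InnoDB implementation; in particular, it does not model what happens when the physical block size exceeds 4096 bytes.

```python
def is_valid_write_size(n):
    """True if n is a power of two between 512 and 4096 bytes."""
    return 512 <= n <= 4096 and (n & (n - 1)) == 0


def effective_write_size(configured, physical_block_size):
    """Raise the configured value to the physical block size if needed."""
    if not is_valid_write_size(configured):
        raise ValueError(f"invalid write-ahead size: {configured}")
    return max(configured, physical_block_size)


valid_sizes = [n for n in range(256, 8193) if is_valid_write_size(n)]
```

For example, a configured value of 512 on a device with 4096-byte blocks would be raised to 4096 in this model, while 4096 on a 512-byte device stays unchanged.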
    • Marko Mäkelä's avatar
      Merge 10.6 into 10.11 · 27a33666
      Marko Mäkelä authored