1. 20 Mar, 2023 7 commits
  2. 19 Mar, 2023 1 commit
  3. 17 Mar, 2023 11 commits
  4. 16 Mar, 2023 21 commits
    • MDEV-30442: Assertion `!m_innodb' failed in ha_partition::cmp_ref ... · 090e5d8b
      Sergei Petrunia authored
      The failed assertion was about encountering the same rowid value in
      two different partitions.
      This wasn't possible with InnoDB previously: InnoDB used a global counter
      to produce rowid values for hidden PK.
      
      After the fix for MDEV-19506, it uses per-table counters so it's easily
      possible to get the same hidden PK values in different tables.
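The situation described above can be sketched as follows. This is a minimal model, not the ha_partition code: `PartitionSketch`, `RowRef` and `cmp_ref_sketch` are hypothetical names, and the sketch assumes that each partition hands out hidden-PK values from its own counter. Under that assumption, equal rowid values in two partitions are legal input, and a comparison has to break the tie by partition number instead of asserting.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model: each partition owns a private hidden-PK counter,
// which is the per-table behaviour the commit message describes.
struct PartitionSketch {
  uint64_t next_rowid = 1;
  uint64_t insert_row() { return next_rowid++; }
};

// Hypothetical row reference: partition number plus the rowid within it.
struct RowRef {
  uint32_t part_id;
  uint64_t rowid;
};

// Order references by rowid first and break ties by partition number,
// so that equal rowids in different partitions compare as distinct rows.
inline int cmp_ref_sketch(const RowRef &a, const RowRef &b) {
  if (a.rowid != b.rowid) return a.rowid < b.rowid ? -1 : 1;
  if (a.part_id != b.part_id) return a.part_id < b.part_id ? -1 : 1;
  return 0;
}
```

With two fresh partitions, the first row inserted into each receives the same rowid, and only the partition number distinguishes the two references.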
    • MDEV-30693: Assertion `dbl_records <= s->records' failed in apply_selectivity_for_table on SELECT · ef5bb081
      Sergei Petrunia authored
      The crash was caused by a discrepancy (rows=2 vs. rows=1) between the
      estimates of the number of rows in a derived table computed by
      TABLE_LIST::fetch_number_of_rows() and by JOIN::add_keyuses_for_splitting().
      
      Made JOIN::add_keyuses_for_splitting() use the result of computations in
      TABLE_LIST::fetch_number_of_rows().
    • Merge 10.6 into 10.8 · acf46b7b
      Marko Mäkelä authored
    • MDEV-26827 Make page flushing even faster · a55b951e
      Marko Mäkelä authored
      For more convenient monitoring of something that could greatly affect
      the volume of page writes, we add the status variable
      Innodb_buffer_pool_pages_split that was previously only available
      via information_schema.innodb_metrics as "innodb_page_splits".
      This was suggested by Axel Schwenke.
      
      buf_flush_page_count: Replaced with buf_pool.stat.n_pages_written.
      We protect buf_pool.stat (except n_page_gets) with buf_pool.mutex
      and remove unnecessary export_vars indirection.
      
      buf_pool.flush_list_bytes: Moved from buf_pool.stat.flush_list_bytes.
      Protected by buf_pool.flush_list_mutex.
      
      buf_pool_t::page_cleaner_status: Replaces buf_pool_t::n_flush_LRU_,
      buf_pool_t::n_flush_list_, and buf_pool_t::page_cleaner_is_idle.
      Protected by buf_pool.flush_list_mutex. We will exclusively broadcast
      buf_pool.done_flush_list by the buf_flush_page_cleaner thread,
      and only wait for it when communicating with buf_flush_page_cleaner.
      There is no need to keep a count of pending writes by the
      buf_pool.flush_list processing. A single flag suffices for that.
      
      Waits for page write completion can be performed by
      simply waiting on block->page.lock, or by invoking
      buf_dblwr.wait_for_page_writes().
      
      buf_LRU_block_free_non_file_page(): Broadcast buf_pool.done_free and
      set buf_pool.try_LRU_scan when freeing a page. This would be
      executed also as part of buf_page_write_complete().
      
      buf_page_write_complete(): Do not broadcast buf_pool.done_flush_list,
      and do not acquire buf_pool.mutex unless buf_pool.LRU eviction is needed.
      Let buf_dblwr count all writes to persistent pages and broadcast a
      condition variable when no outstanding writes remain.
      
      buf_flush_page_cleaner(): Prioritize LRU flushing and eviction right after
      "furious flushing" (lsn_limit). Simplify the conditions and reduce the
      hold time of buf_pool.flush_list_mutex. Refuse to shut down
      or sleep if buf_pool.ran_out(), that is, LRU eviction is needed.
      
      buf_pool_t::page_cleaner_wakeup(): Add the optional parameter for_LRU.
      
      buf_LRU_get_free_block(): Protect buf_lru_free_blocks_error_printed
      with buf_pool.mutex. Invoke buf_pool.page_cleaner_wakeup(true)
      to ensure that buf_flush_page_cleaner() will process the LRU flush
      request.
      
      buf_do_LRU_batch(), buf_flush_list(), buf_flush_list_space():
      Update buf_pool.stat.n_pages_written when submitting writes
      (while holding buf_pool.mutex), not when completing them.
      
      buf_page_t::flush(), buf_flush_discard_page(): Require that
      the page U-latch be acquired upfront, and remove
      buf_page_t::ready_for_flush().
      
      buf_pool_t::delete_from_flush_list(): Remove the parameter "bool clear".
      
      buf_flush_page(): Count pending page writes via buf_dblwr.
      
      buf_flush_try_neighbors(): Take the block of page_id as a parameter.
      If the tablespace is dropped before our page has been written out,
      release the page U-latch.
      
      buf_pool_invalidate(): Let the caller ensure that there are no
      outstanding writes.
      
      buf_flush_wait_batch_end(false),
      buf_flush_wait_batch_end_acquiring_mutex(false):
      Replaced with buf_dblwr.wait_for_page_writes().
      
      buf_flush_wait_LRU_batch_end(): Replaces buf_flush_wait_batch_end(true).
      
      buf_flush_list(): Remove some broadcast of buf_pool.done_flush_list.
      
      buf_flush_buffer_pool(): Invoke also buf_dblwr.wait_for_page_writes().
      
      buf_pool_t::io_pending(), buf_pool_t::n_flush_list(): Remove.
      Outstanding writes are reflected by buf_dblwr.pending_writes().
      
      buf_dblwr_t::init(): New function, to initialize the mutex and
      the condition variables, but not the backing store.
      
      buf_dblwr_t::is_created(): Replaces buf_dblwr_t::is_initialised().
      
      buf_dblwr_t::pending_writes(), buf_dblwr_t::writes_pending:
      Keeps track of writes of persistent data pages.
      
      buf_flush_LRU(): Allow calls while LRU flushing may be in progress
      in another thread.
      
      Tested by Matthias Leich (correctness) and Axel Schwenke (performance)
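The idea of counting outstanding writes and broadcasting a condition variable once none remain can be sketched roughly as follows. The class and member names are hypothetical and are not the buf_dblwr_t interface; the sketch uses the C++ standard library primitives rather than InnoDB's own mutex and condition variable wrappers.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>

// Hypothetical stand-in for the write accounting described above;
// the names are illustrative and not the buf_dblwr_t members.
class PendingWritesSketch {
  std::mutex mutex_;
  std::condition_variable none_pending_;
  size_t pending_ = 0;

public:
  // Call when a write of a persistent page is submitted.
  void add() {
    std::lock_guard<std::mutex> g(mutex_);
    ++pending_;
  }

  // Call from the write completion path.
  void complete() {
    std::lock_guard<std::mutex> g(mutex_);
    assert(pending_ > 0);
    if (--pending_ == 0)
      none_pending_.notify_all();  // wake every waiter once nothing is outstanding
  }

  // Number of writes that have been submitted but not completed.
  size_t pending() {
    std::lock_guard<std::mutex> g(mutex_);
    return pending_;
  }

  // Block until every submitted write has completed.
  void wait_for_writes() {
    std::unique_lock<std::mutex> l(mutex_);
    none_pending_.wait(l, [this] { return pending_ == 0; });
  }
};
```

A waiter only needs the single predicate "no writes pending", which is why a counter plus one broadcast on the transition to zero is enough in this model.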
    • MDEV-26055: Improve adaptive flushing · 9593cccf
      Marko Mäkelä authored
      Adaptive flushing is enabled by setting innodb_max_dirty_pages_pct_lwm>0
      (not default) and innodb_adaptive_flushing=ON (default).
      There is also the parameter innodb_adaptive_flushing_lwm
      (default: 10 per cent of the log capacity). It should enable some
      adaptive flushing even when innodb_max_dirty_pages_pct_lwm=0.
      That is not being changed here.
      
      This idea was first presented by Inaam Rana several years ago,
      and I discussed it with Jean-François Gagné at FOSDEM 2023.
      
      buf_flush_page_cleaner(): When we are not near the log capacity limit
      (neither buf_flush_async_lsn nor buf_flush_sync_lsn are set),
      also try to move clean blocks from the buf_pool.LRU list to buf_pool.free
      or initiate writes (but not the eviction) of dirty blocks, until
      the remaining I/O capacity has been consumed.
      
      buf_flush_LRU_list_batch(): Add the parameter bool evict, to specify
      whether dirty least recently used pages (from buf_pool.LRU) should
      be evicted immediately after they have been written out. Callers outside
      buf_flush_page_cleaner() will pass evict=true, to retain the existing
      behaviour.
      
      buf_do_LRU_batch(): Add the parameter bool evict.
      Return counts of evicted and flushed pages.
      
      buf_flush_LRU(): Add the parameter bool evict.
      Assume that the caller holds buf_pool.mutex and
      will invoke buf_dblwr.flush_buffered_writes() afterwards.
      
      buf_flush_list_holding_mutex(): A low-level variant of buf_flush_list()
      whose caller must hold buf_pool.mutex and invoke
      buf_dblwr.flush_buffered_writes() afterwards.
      
      buf_flush_wait_batch_end_acquiring_mutex(): Remove. It is enough to have
      buf_flush_wait_batch_end().
      
      page_cleaner_flush_pages_recommendation(): Avoid some floating-point
      arithmetic.
      
      buf_flush_page(), buf_flush_check_neighbor(), buf_flush_check_neighbors(),
      buf_flush_try_neighbors(): Rename the parameter "bool lru" to "bool evict".
      
      buf_free_from_unzip_LRU_list_batch(): Remove the parameter.
      Only actual page writes will contribute towards the limit.
      
      buf_LRU_free_page(): Evict freed pages of temporary tables.
      
      buf_pool.done_free: Broadcast whenever a block is freed
      (and buf_pool.try_LRU_scan is set).
      
      buf_pool_t::io_buf_t::reserve(): Retry indefinitely.
      During the test encryption.innochecksum we easily run out of
      these buffers for PAGE_COMPRESSED or ENCRYPTED pages.
      
      Tested by Matthias Leich and Axel Schwenke
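The effect of the new `evict` parameter can be illustrated with a simplified model. Everything here is an assumption made for the sketch: `BlockSketch`, `BatchCounts` and `lru_batch_sketch` are hypothetical, a page write is modelled by clearing a flag synchronously, and the batch limit counts scanned blocks. The real functions work on buf_pool.LRU with asynchronous writes.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>

// Hypothetical buffer block: an identifier and a dirty flag.
struct BlockSketch { int id; bool dirty; };

// Counts returned by one batch, mirroring "evicted and flushed pages".
struct BatchCounts { size_t evicted = 0; size_t flushed = 0; };

// Scan up to `max` blocks from the cold end (front) of the LRU list.
// Clean blocks are moved to the free list. Dirty blocks are "written"
// (modelled by clearing the flag) and are moved to the free list only
// when evict is true; otherwise they stay in the LRU list, now clean.
BatchCounts lru_batch_sketch(std::list<BlockSketch> &lru,
                             std::list<BlockSketch> &free_list,
                             size_t max, bool evict) {
  BatchCounts n;
  auto it = lru.begin();
  for (size_t scanned = 0; it != lru.end() && scanned < max; ++scanned) {
    const bool was_dirty = it->dirty;
    if (was_dirty) {
      it->dirty = false;
      ++n.flushed;
    }
    if (!was_dirty || evict) {
      auto next = std::next(it);
      free_list.splice(free_list.end(), lru, it);
      ++n.evicted;
      it = next;
    } else {
      ++it;
    }
  }
  return n;
}
```

With evict=false, dirty blocks are written out ahead of time but remain cached, which matches the description of initiating writes without eviction while spare I/O capacity is available.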
    • MDEV-30357 Performance regression in locking reads from secondary indexes · 4105017a
      Marko Mäkelä authored
      lock_sec_rec_some_has_impl(): Remove a harmful condition that caused the
      performance regression and should not have been added in
      commit b6e41e38 in the first place.
      Locking transactions that have not modified any persistent tables
      can carry the transaction identifier 0.
      
      trx_t::max_inactive_id: A cache for trx_sys_t::find_same_or_older().
      The value is not reset on transaction commit so that previous results
      can be reused for subsequent transactions. The smallest active
      transaction ID can only increase over time, not decrease.
      
      trx_sys_t::find_same_or_older(): Remember the maximum previous id for which
      rw_trx_hash.iterate() returned false, to avoid redundant iterations.
      
      lock_sec_rec_read_check_and_lock(): Add an early return in case we are
      already holding a covering table lock.
      
      lock_rec_convert_impl_to_expl(): Add a template parameter to avoid
      a redundant run-time check on whether the index is secondary.
      
      lock_rec_convert_impl_to_expl_for_trx(): Move some code from
      lock_rec_convert_impl_to_expl(), to reduce code duplication due
      to the added template parameter.
      
      Reviewed by: Vladislav Lesin
      Tested by: Matthias Leich
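The caching idea behind trx_t::max_inactive_id can be sketched as below. The types are hypothetical (`TrxSysSketch` stands in for the transaction registry and uses a std::set instead of rw_trx_hash), and the sketch relies on the property stated in the message: the smallest active transaction ID never decreases, so a negative answer for some id remains valid for that id and for every smaller one.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>

using trx_id_sketch = uint64_t;

// Hypothetical registry of active read-write transaction ids.
struct TrxSysSketch {
  std::set<trx_id_sketch> active;
  size_t scans = 0;  // how many times the "expensive" lookup ran

  // Is there an active transaction whose id is <= id?
  bool scan_same_or_older(trx_id_sketch id) {
    ++scans;
    return !active.empty() && *active.begin() <= id;
  }
};

// Hypothetical per-transaction cache of negative lookup results.
struct TrxSketch {
  // Largest id for which the scan has returned false so far.
  // It is deliberately never reset between transactions.
  trx_id_sketch max_inactive_id = 0;

  bool find_same_or_older(TrxSysSketch &sys, trx_id_sketch id) {
    if (id <= max_inactive_id)
      return false;  // answered from the cache, no scan needed
    const bool found = sys.scan_same_or_older(id);
    if (!found)
      max_inactive_id = id;
    return found;
  }
};
```

In this model a repeated lookup for an id at or below the cached value costs one comparison instead of a registry scan.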
    • MDEV-29835 InnoDB hang on B-tree split or merge · f2096478
      Marko Mäkelä authored
      This is a follow-up to
      commit de4030e4 (MDEV-30400),
      which fixed some hangs related to B-tree split or merge.
      
      btr_root_block_get(): Use and update the root page guess. This is just
      a minor performance optimization, not affecting correctness.
      
      btr_validate_level(): Remove the parameter "lockout", and always
      acquire an exclusive dict_index_t::lock in CHECK TABLE without QUICK.
      This is needed in order to avoid latching order violation in
      btr_page_get_father_node_ptr_for_validate().
      
      btr_cur_need_opposite_intention(): Return true in case
      btr_cur_compress_recommendation() would hold later during the
      mini-transaction, or if a page underflow or overflow is possible.
      If we return true, our caller will escalate to acquiring an exclusive
      dict_index_t::lock, to prevent a latching order violation and deadlock
      during btr_compress() or btr_page_split_and_insert().
      
      btr_cur_t::search_leaf(), btr_cur_t::open_leaf():
      Also invoke btr_cur_need_opposite_intention() on the leaf page.
      
      btr_cur_t::open_leaf(): When escalating to exclusive index locking,
      acquire exclusive latches on all pages as well.
      
      innobase_instant_try(): Return an error code if the root page cannot
      be retrieved.
      
      In addition to the normal stress testing with Random Query Generator (RQG)
      this has been tested with
      ./mtr --mysqld=--loose-innodb-limit-optimistic-insert-debug=2
      but with the injection in btr_cur_optimistic_insert() for non-leaf pages
      adjusted so that it would use the value 3. (Otherwise, infinite page
      splits could occur in some mtr tests.)
      
      Tested by: Matthias Leich
    • Merge 10.5 into 10.6 · 85cbfaef
      Marko Mäkelä authored
    • MDEV-30860 Race condition between buffer pool flush and log file deletion in mariadb-backup --prepare · 1495f057
      Marko Mäkelä authored
      
      srv_start(): If we are going to close the log file in
      mariadb-backup --prepare, call buf_flush_sync() before
      calling recv_sys.debug_free() to ensure that the log file
      will not be accessed.
      
      This fixes a rather rare failure in the test
      mariabackup.innodb_force_recovery where buf_flush_page_cleaner()
      would invoke log_checkpoint_low() because !recv_recovery_is_on()
      would hold due to the fact that recv_sys.debug_free() had
      already been called. Then, the log write for the checkpoint
      would fail because srv_start() had invoked log_sys.log.close_file().
    • MDEV-30811: Build issues on macOS 11.0 · 1310b3a0
      Dmitry Shulga authored
      Building MariaDB Server versions 11.0 and 11.1 fails on macOS 12.x
      (Monterey). The failure occurs while generating header files from the
      error messages contained in the file errmsg-utf8.txt. This step is
      performed by the utility comp_err, which crashes when it is run on
      macOS Monterey.
      
      comp_err invokes my_init() at the very beginning, before the thread
      environment has been initialized. While executing, my_init() calls
      my_readlink(), a wrapper around the system call readlink() with extra
      error handling. If readlink() returns an error, the following block
      of code is run:
        if (my_thread_var)
          my_errno= errno;
      
      my_thread_var is a macro that expands to an invocation of
      my_pthread_getspecific() on the thread-specific key THR_KEY_mysys.
      Unfortunately, the TSD key THR_KEY_mysys is initialized only after the
      call to my_init(), so the return value of pthread_getspecific() is
      platform dependent. On Linux, pthread_getspecific() returns NULL if
      the key is not a valid TSD key. On macOS, the effect of calling
      pthread_getspecific() with a key value not obtained from
      pthread_key_create() is undefined. So, on macOS pthread_getspecific()
      returns some invalid address, to which the errno value is written.
      This leads to a crash later, when the library API function
      pthread_self() is called.
      
      To fix the issue, initialization of the thread environment is moved to
      the very beginning of the main() function, so that the environment is
      fully initialized by the time my_init() is invoked.
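The ordering requirement can be shown with a small model. This is not the mysys code: `ThreadVarSketch`, `thread_init_sketch` and `save_errno_sketch` are hypothetical, and the sketch uses a C++ `thread_local` pointer, which is well-defined (null) before initialization, unlike a pthread key that has not yet been created. It only illustrates why the guarded assignment is safe once per-thread state is set up before anything else runs.

```cpp
#include <cassert>
#include <cerrno>

// Hypothetical per-thread state, standing in for the mysys thread variable.
struct ThreadVarSketch { int saved_errno = 0; };

// Null until thread_init_sketch() has run on the current thread.
thread_local ThreadVarSketch *thread_var_sketch = nullptr;
thread_local ThreadVarSketch thread_var_storage;

// Models the thread environment initialization that the fix moves
// to the very beginning of main().
void thread_init_sketch() { thread_var_sketch = &thread_var_storage; }

// Mirrors the guarded assignment quoted above: the error code is
// saved only if per-thread state exists. Returns whether it was saved.
bool save_errno_sketch(int err) {
  if (!thread_var_sketch)
    return false;
  thread_var_sketch->saved_errno = err;
  return true;
}
```

Calling the initialization first means the guard never has to depend on what an uninitialized lookup happens to return.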
    • Igor Babaev's avatar
    • MDEV-29390: Improve coverage for UPDATE and DELETE statements in MTR test suites · 1e0a72a1
      Lena Startseva authored
      Created tests for "delete" based on update_use_source.test
      
      For the update_use_source.test tests, data recovery in the table has been changed
      from a rollback transaction to a complete delete and re-insert of the data with
      optimize table. Cases are now being checked on three engines.
      
      Added tests for update/delete with LooseScan and DuplicateWeedout optimization strategies
      Added tests for engine MEMORY on delete and update
      Added tests for multi-update with JSON_TABLE
      Added tests for multi-update and multi-delete for engine Connect
    • Igor Babaev's avatar
      9a3fd1df
    • Fixes of MDEV-30538 and MDEV-30586 for 10.4 adjusted for 11.0. · c912fd3b
      Igor Babaev authored
      The commits for MDEV-30538 and MDEV-30586 could not be cherry-picked into
      11.0 separately.
    • MDEV-7487 Semi-join optimization for single-table update/delete statements · 554278e2
      Igor Babaev authored
      This patch allows semi-join optimization to be used at the top level of
      single-table update and delete statements.
      Supporting this optimization became easy once the processing of
      single-table update/delete statements started using the JOIN
      structure. This made it possible to use JOIN::prepare() not only for
      multi-table updates/deletes but for single-table ones as well. This was
      done in the patch for MDEV-28883:
      Re-design the upper level of handling UPDATE and DELETE statements.
      
      Note that JOIN::prepare() detects all subqueries that can be considered
      as candidates for semi-join optimization. The code added by this patch
      looks for such candidates at the top level and if such candidates are found
      in the processed single-table update/delete the statement is handled in
      the same way as a multi-table update/delete.
      
      Approved by Oleksandr Byelkin <sanja@mariadb.com>
    • Igor Babaev's avatar
    • MDEV-29428 Incorrect result for delete with "order by" clause · c22f7e8e
      Igor Babaev authored
      ORDER BY clause without LIMIT clause can be removed from DELETE statements.
    • Igor Babaev's avatar
      ee495b22
    • Applied the changes introduced in the commit · 11701780
      Igor Babaev authored
      92a32809
      Author:	Oleksandr Byelkin <sanja@mariadb.com>  Tue Jul 12 00:25:08 2022
      Committer:	Oleksandr Byelkin <sanja@mariadb.com>  Thu Jul 14 00:46:06 2022
      
      for the code of MDEV-28883.
    • MDEV-29189 Crash of the second execution of SF using DELETE/UPDATE · 24f75b7f
      Igor Babaev authored
      This bug caused a server crash at the second execution of a stored
      function that used a DELETE or UPDATE statement if the first execution
      of this function reported an error encountered after the prepare phase.
      This happened because in such cases the executed DELETE/UPDATE statement
      remained marked as prepared. As a result, the second execution of the
      stored function skipped the prepare phase for the statement altogether
      and the statement could not be executed properly.
      
      Approved by Oleksandr Byelkin <sanja@mariadb.com>
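The failure mode can be modelled in a few lines. This is a sketch under assumptions, not the server code and not necessarily how the patch resolves it: `StatementSketch` is hypothetical, and clearing the flag on error is just one way to make the next execution run the prepare phase again.

```cpp
#include <cassert>

// Hypothetical statement object with a "prepared" marker.
struct StatementSketch {
  bool prepared = false;
  int prepare_calls = 0;

  void prepare() {
    ++prepare_calls;
    prepared = true;
  }

  // Returns true on success. fail_after_prepare simulates an error
  // that is raised after the prepare phase has completed.
  bool execute(bool fail_after_prepare) {
    if (!prepared)
      prepare();
    if (fail_after_prepare) {
      prepared = false;  // do not leave a failed statement marked as prepared
      return false;
    }
    return true;
  }
};
```

If the flag were left set after the failed run, the second call would skip prepare() entirely, which is the behaviour the message describes as leading to the crash.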
    • Assertion failure with UPDATE of view using MERGE table · 9f796526
      Igor Babaev authored
      The problem was caused by an assertion that is not valid anymore.