  1. 24 Mar, 2023 1 commit
    • Fix trivial spelling errors · 50c8ef01
      Otto Kekalainen authored
      - agressively -> aggressively
      - exising -> existing
      - occured -> occurred
      - releated -> related
      - seperated -> separated
      - sucess -> success
      - use use -> use
      
      All new code of the whole pull request, including one or several files
      that are either new files or modified ones, are contributed under the
      BSD-new license. I am contributing on behalf of my employer Amazon Web
      Services, Inc.
  2. 20 Mar, 2023 5 commits
  3. 19 Mar, 2023 1 commit
  4. 17 Mar, 2023 9 commits
  5. 16 Mar, 2023 7 commits
    • Merge 10.6 into 10.8 · acf46b7b
      Marko Mäkelä authored
    • MDEV-26827 Make page flushing even faster · a55b951e
      Marko Mäkelä authored
      For more convenient monitoring of something that could greatly affect
      the volume of page writes, we add the status variable
      Innodb_buffer_pool_pages_split that was previously only available
      via information_schema.innodb_metrics as "innodb_page_splits".
      This was suggested by Axel Schwenke.
      
      buf_flush_page_count: Replaced with buf_pool.stat.n_pages_written.
      We protect buf_pool.stat (except n_page_gets) with buf_pool.mutex
      and remove unnecessary export_vars indirection.
      
      buf_pool.flush_list_bytes: Moved from buf_pool.stat.flush_list_bytes.
      Protected by buf_pool.flush_list_mutex.
      
      buf_pool_t::page_cleaner_status: Replaces buf_pool_t::n_flush_LRU_,
      buf_pool_t::n_flush_list_, and buf_pool_t::page_cleaner_is_idle.
      Protected by buf_pool.flush_list_mutex. We will exclusively broadcast
      buf_pool.done_flush_list by the buf_flush_page_cleaner thread,
      and only wait for it when communicating with buf_flush_page_cleaner.
      There is no need to keep a count of pending writes by the
      buf_pool.flush_list processing. A single flag suffices for that.
      
      Waits for page write completion can be performed by
      simply waiting on block->page.lock, or by invoking
      buf_dblwr.wait_for_page_writes().
      
      buf_LRU_block_free_non_file_page(): Broadcast buf_pool.done_free and
      set buf_pool.try_LRU_scan when freeing a page. This would be
      executed also as part of buf_page_write_complete().
      
      buf_page_write_complete(): Do not broadcast buf_pool.done_flush_list,
      and do not acquire buf_pool.mutex unless buf_pool.LRU eviction is needed.
      Let buf_dblwr count all writes to persistent pages and broadcast a
      condition variable when no outstanding writes remain.
      
      buf_flush_page_cleaner(): Prioritize LRU flushing and eviction right after
      "furious flushing" (lsn_limit). Simplify the conditions and reduce the
      hold time of buf_pool.flush_list_mutex. Refuse to shut down
      or sleep if buf_pool.ran_out(), that is, LRU eviction is needed.
      
      buf_pool_t::page_cleaner_wakeup(): Add the optional parameter for_LRU.
      
      buf_LRU_get_free_block(): Protect buf_lru_free_blocks_error_printed
      with buf_pool.mutex. Invoke buf_pool.page_cleaner_wakeup(true)
      to ensure that buf_flush_page_cleaner() will process the LRU flush
      request.
      
      buf_do_LRU_batch(), buf_flush_list(), buf_flush_list_space():
      Update buf_pool.stat.n_pages_written when submitting writes
      (while holding buf_pool.mutex), not when completing them.
      
      buf_page_t::flush(), buf_flush_discard_page(): Require that
      the page U-latch be acquired upfront, and remove
      buf_page_t::ready_for_flush().
      
      buf_pool_t::delete_from_flush_list(): Remove the parameter "bool clear".
      
      buf_flush_page(): Count pending page writes via buf_dblwr.
      
      buf_flush_try_neighbors(): Take the block of page_id as a parameter.
      If the tablespace is dropped before our page has been written out,
      release the page U-latch.
      
      buf_pool_invalidate(): Let the caller ensure that there are no
      outstanding writes.
      
      buf_flush_wait_batch_end(false),
      buf_flush_wait_batch_end_acquiring_mutex(false):
      Replaced with buf_dblwr.wait_for_page_writes().
      
      buf_flush_wait_LRU_batch_end(): Replaces buf_flush_wait_batch_end(true).
      
      buf_flush_list(): Remove some broadcast of buf_pool.done_flush_list.
      
      buf_flush_buffer_pool(): Invoke also buf_dblwr.wait_for_page_writes().
      
      buf_pool_t::io_pending(), buf_pool_t::n_flush_list(): Remove.
      Outstanding writes are reflected by buf_dblwr.pending_writes().
      
      buf_dblwr_t::init(): New function, to initialize the mutex and
      the condition variables, but not the backing store.
      
      buf_dblwr_t::is_created(): Replaces buf_dblwr_t::is_initialised().
      
      buf_dblwr_t::pending_writes(), buf_dblwr_t::writes_pending:
      Keeps track of writes of persistent data pages.
      
      buf_flush_LRU(): Allow calls while LRU flushing may be in progress
      in another thread.
      
      Tested by Matthias Leich (correctness) and Axel Schwenke (performance)
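
      The write accounting described above (count every submitted write,
      and broadcast a condition variable once no outstanding writes remain)
      can be sketched as follows. This is a minimal illustration, not the
      InnoDB code: the class pending_write_counter and its members
      add_pending() and write_completed() are hypothetical names, and only
      pending_writes() and wait_for_page_writes() echo names from the
      commit message.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>

// Hypothetical stand-in for the write accounting described in the commit
// message; this is an illustration, not the actual buf_dblwr_t.
class pending_write_counter
{
  std::mutex mutex;
  std::condition_variable writes_done;
  std::size_t writes_pending = 0;

public:
  // Called when a page write is submitted.
  void add_pending()
  {
    std::lock_guard<std::mutex> g(mutex);
    ++writes_pending;
  }

  // Called when a page write completes. Waiters are woken up only when
  // the last outstanding write has finished.
  void write_completed()
  {
    std::lock_guard<std::mutex> g(mutex);
    assert(writes_pending > 0);
    if (--writes_pending == 0)
      writes_done.notify_all();
  }

  // Number of writes that have been submitted but not completed.
  std::size_t pending_writes()
  {
    std::lock_guard<std::mutex> g(mutex);
    return writes_pending;
  }

  // Blocks until no outstanding writes remain.
  void wait_for_page_writes()
  {
    std::unique_lock<std::mutex> l(mutex);
    writes_done.wait(l, [this] { return writes_pending == 0; });
  }
};
```

      In this sketch a single counter and one condition variable are
      enough, because waiters only care whether the count has reached
      zero, not which particular write has finished.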
    • MDEV-26055: Improve adaptive flushing · 9593cccf
      Marko Mäkelä authored
      Adaptive flushing is enabled by setting innodb_max_dirty_pages_pct_lwm>0
      (not default) and innodb_adaptive_flushing=ON (default).
      There is also the parameter innodb_adaptive_flushing_lwm
      (default: 10 per cent of the log capacity). It should enable some
      adaptive flushing even when innodb_max_dirty_pages_pct_lwm=0.
      That is not being changed here.
      
      This idea was first presented by Inaam Rana several years ago,
      and I discussed it with Jean-François Gagné at FOSDEM 2023.
      
      buf_flush_page_cleaner(): When we are not near the log capacity limit
      (neither buf_flush_async_lsn nor buf_flush_sync_lsn is set),
      also try to move clean blocks from the buf_pool.LRU list to buf_pool.free
      or initiate writes (but not the eviction) of dirty blocks, until
      the remaining I/O capacity has been consumed.
      
      buf_flush_LRU_list_batch(): Add the parameter bool evict, to specify
      whether dirty least recently used pages (from buf_pool.LRU) should
      be evicted immediately after they have been written out. Callers outside
      buf_flush_page_cleaner() will pass evict=true, to retain the existing
      behaviour.
      
      buf_do_LRU_batch(): Add the parameter bool evict.
      Return counts of evicted and flushed pages.
      
      buf_flush_LRU(): Add the parameter bool evict.
      Assume that the caller holds buf_pool.mutex and
      will invoke buf_dblwr.flush_buffered_writes() afterwards.
      
      buf_flush_list_holding_mutex(): A low-level variant of buf_flush_list()
      whose caller must hold buf_pool.mutex and invoke
      buf_dblwr.flush_buffered_writes() afterwards.
      
      buf_flush_wait_batch_end_acquiring_mutex(): Remove. It is enough to have
      buf_flush_wait_batch_end().
      
      page_cleaner_flush_pages_recommendation(): Avoid some floating-point
      arithmetic.
      
      buf_flush_page(), buf_flush_check_neighbor(), buf_flush_check_neighbors(),
      buf_flush_try_neighbors(): Rename the parameter "bool lru" to "bool evict".
      
      buf_free_from_unzip_LRU_list_batch(): Remove the parameter.
      Only actual page writes will contribute towards the limit.
      
      buf_LRU_free_page(): Evict freed pages of temporary tables.
      
      buf_pool.done_free: Broadcast whenever a block is freed
      (and buf_pool.try_LRU_scan is set).
      
      buf_pool_t::io_buf_t::reserve(): Retry indefinitely.
      During the test encryption.innochecksum we easily run out of
      these buffers for PAGE_COMPRESSED or ENCRYPTED pages.
      
      Tested by Matthias Leich and Axel Schwenke
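
      The effect of the new "bool evict" parameter can be sketched with a
      toy LRU batch. Everything here is a simplified assumption for
      illustration (the types page and lru_batch_result, the function
      lru_batch(), and modelling a write by clearing a flag); it is not the
      signature or logic of buf_flush_LRU_list_batch(). Clean pages are
      always moved to the free list, while dirty pages are written and only
      evicted when evict is true. As a further simplification, only writes
      count towards the limit.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>

// Hypothetical, heavily simplified page descriptor.
struct page { bool dirty; };

// Counts returned by the batch: pages written and pages evicted.
struct lru_batch_result
{
  std::size_t flushed = 0;
  std::size_t evicted = 0;
};

// Walks the LRU list from its least recently used end (the back).
// A "write" is modelled by clearing the dirty flag.
lru_batch_result lru_batch(std::list<page>& lru, std::list<page>& free_list,
                           std::size_t max_writes, bool evict)
{
  lru_batch_result r;
  auto it = lru.end();
  while (it != lru.begin())
  {
    auto cur = std::prev(it);
    if (cur->dirty)
    {
      if (r.flushed == max_writes)
        break;               // the write budget has been consumed
      cur->dirty = false;    // initiate the write
      ++r.flushed;
      if (!evict)
      {
        it = cur;            // keep the written page in the LRU list
        continue;
      }
    }
    // Move the clean (or just written) page to the free list.
    // 'it' stays valid because it does not refer to the moved element.
    free_list.splice(free_list.end(), lru, cur);
    ++r.evicted;
  }
  return r;
}
```

      With evict=false, a batch over the list {dirty, clean, dirty} writes
      both dirty pages but frees only the clean one, which corresponds to
      the "initiate writes (but not the eviction)" behaviour described for
      buf_flush_page_cleaner().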
    • MDEV-30357 Performance regression in locking reads from secondary indexes · 4105017a
      Marko Mäkelä authored
      lock_sec_rec_some_has_impl(): Remove a harmful condition that caused the
      performance regression and should not have been added in
      commit b6e41e38 in the first place.
      Locking transactions that have not modified any persistent tables
      can carry the transaction identifier 0.
      
      trx_t::max_inactive_id: A cache for trx_sys_t::find_same_or_older().
      The value is not reset on transaction commit so that previous results
      can be reused for subsequent transactions. The smallest active
      transaction ID can only increase over time, not decrease.
      
      trx_sys_t::find_same_or_older(): Remember the maximum previous id for which
      rw_trx_hash.iterate() returned false, to avoid redundant iterations.
      
      lock_sec_rec_read_check_and_lock(): Add an early return in case we are
      already holding a covering table lock.
      
      lock_rec_convert_impl_to_expl(): Add a template parameter to avoid
      a redundant run-time check on whether the index is secondary.
      
      lock_rec_convert_impl_to_expl_for_trx(): Move some code from
      lock_rec_convert_impl_to_expl(), to reduce code duplication due
      to the added template parameter.
      
      Reviewed by: Vladislav Lesin
      Tested by: Matthias Leich
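
      The caching idea behind trx_t::max_inactive_id can be sketched as
      follows. This is a minimal model under stated assumptions: the set
      of active transactions is a std::set instead of rw_trx_hash, and the
      types trx_sys_model and caller_trx as well as the scans counter are
      hypothetical. The cache is valid only because the smallest active
      transaction ID never decreases.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

using trx_id_t = std::uint64_t;

// Hypothetical per-transaction cache, modelled on the idea of
// trx_t::max_inactive_id: the largest id for which the lookup
// already returned false.
struct caller_trx
{
  trx_id_t max_inactive_id = 0;
};

// Hypothetical model of the set of active read-write transactions.
struct trx_sys_model
{
  std::set<trx_id_t> active_ids;
  unsigned scans = 0;  // how many times the full lookup was performed

  // Returns true if some active transaction has an identifier <= id.
  bool find_same_or_older(caller_trx& trx, trx_id_t id)
  {
    if (id <= trx.max_inactive_id)
      return false;  // already known: nothing this old is still active
    ++scans;
    const bool found = !active_ids.empty() && *active_ids.begin() <= id;
    if (!found)
      trx.max_inactive_id = id;  // remember the negative result
    return found;
  }
};
```

      Only negative results are cached in this sketch. A positive result
      can become stale as soon as the matching transaction commits, so it
      has to be looked up again.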
    • MDEV-29835 InnoDB hang on B-tree split or merge · f2096478
      Marko Mäkelä authored
      This is a follow-up to
      commit de4030e4 (MDEV-30400),
      which fixed some hangs related to B-tree split or merge.
      
      btr_root_block_get(): Use and update the root page guess. This is just
      a minor performance optimization, not affecting correctness.
      
      btr_validate_level(): Remove the parameter "lockout", and always
      acquire an exclusive dict_index_t::lock in CHECK TABLE without QUICK.
      This is needed in order to avoid latching order violation in
      btr_page_get_father_node_ptr_for_validate().
      
      btr_cur_need_opposite_intention(): Return true in case
      btr_cur_compress_recommendation() would hold later during the
      mini-transaction, or if a page underflow or overflow is possible.
      If we return true, our caller will escalate to acquiring an exclusive
      dict_index_t::lock, to prevent a latching order violation and deadlock
      during btr_compress() or btr_page_split_and_insert().
      
      btr_cur_t::search_leaf(), btr_cur_t::open_leaf():
      Also invoke btr_cur_need_opposite_intention() on the leaf page.
      
      btr_cur_t::open_leaf(): When escalating to exclusive index locking,
      acquire exclusive latches on all pages as well.
      
      innobase_instant_try(): Return an error code if the root page cannot
      be retrieved.
      
      In addition to the normal stress testing with Random Query Generator (RQG)
      this has been tested with
      ./mtr --mysqld=--loose-innodb-limit-optimistic-insert-debug=2
      but with the injection in btr_cur_optimistic_insert() for non-leaf pages
      adjusted so that it would use the value 3. (Otherwise, infinite page
      splits could occur in some mtr tests.)
      
      Tested by: Matthias Leich
    • Merge 10.5 into 10.6 · 85cbfaef
      Marko Mäkelä authored
    • MDEV-30860 Race condition between buffer pool flush and log file deletion in mariadb-backup --prepare · 1495f057
      Marko Mäkelä authored
      srv_start(): If we are going to close the log file in
      mariadb-backup --prepare, call buf_flush_sync() before
      calling recv_sys.debug_free() to ensure that the log file
      will not be accessed.
      
      This fixes a rather rare failure in the test
      mariabackup.innodb_force_recovery where buf_flush_page_cleaner()
      would invoke log_checkpoint_low() because !recv_recovery_is_on()
      would hold due to the fact that recv_sys.debug_free() had
      already been called. Then, the log write for the checkpoint
      would fail because srv_start() had invoked log_sys.log.close_file().
  6. 14 Mar, 2023 2 commits
    • MDEV-28958 Crash when checking whether condition can be pushed into view · e97560ea
      Igor Babaev authored
      Do not set any flags in the items for constant subformulas TRUE/FALSE when
      checking pushability of a formula into a view. Occurrences of these
      subformulas can be ignored when checking pushability of the formula.
      At the same time the items used for these constants became immutable
      starting from version 10.7.
      
      Approved by Oleksandr Byelkin <sanja@mariadb.com>
    • MDEV-30805 SIGSEGV in my_convert and UBSAN: member access within null pointer of type 'const struct MY_CHARSET_HANDLER' in my_convert · 47036387
      Alexander Barkov authored
      Type_handler::partition_field_append_value() erroneously
      passed the address of my_collation_contextually_typed_binary
      to conversion functions copy_and_convert() and my_convert().
      
      This happened because generate_partition_syntax_for_frm()
      was called from mysql_create_frm_image() in the stage when
      the fields in List<Create_field> can still contain unresolved
      contextual collations, like "binary" in the reported crash scenario:
      
        ALTER TABLE t CHANGE COLUMN a a CHAR BINARY;
      
      Fix:
      
      1. Splitting mysql_prepare_create_table() into two parts:
         - mysql_prepare_create_table_stage1() iterates through
           List<Create_field> and calls Create_field::prepare_stage1(),
           which performs basic attribute initialization, including
           context collation resolution.
         - mysql_prepare_create_table_finalize() - the rest of the
           old mysql_prepare_create_table() code.
      
      2. Changing mysql_create_frm_image():
         It now calls:
         - mysql_prepare_create_table_stage1() at the very
           beginning, before the partition related code.
         - mysql_prepare_create_table_finalize() at the end,
           instead of the old mysql_prepare_create_table() call
      
      3. Adding mysql_prepare_create_table() as a wrapper
         for two calls:
           mysql_prepare_create_table_stage1() ||
           mysql_prepare_create_table_finalize()
         so the code stays unchanged in the other places
         where mysql_prepare_create_table() was used.
      
      4. Changing prototype for Type_handler::Column_definition_prepare_stage1()
         Removing arguments:
         - handler *file
         - ulonglong table_flags
         Adding a new argument instead:
         - column_definition_type_t type
         This allows calling Column_definition_prepare_stage1(), and
         therefore mysql_prepare_create_table_stage1(),
         before instantiation of a handler.
         This simplifies the code, because in case of a partitioned table,
         mysql_create_frm_image() creates a handler of the underlying
         partition first, then frees it and creates a ha_partition
         instance instead.
         mysql_prepare_create_table() before the fix was called with the final
         (ha_partition) handler.
      
      5. Moving parts of Column_definition_prepare_stage1() which
         need a pointer to handler and table_flags to
         Column_definition_prepare_stage2().
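
      The split into two stages plus a wrapper (points 1 to 3 above) can be
      sketched in miniature. All names below (column_def, prepare_stage1(),
      prepare_finalize(), prepare()) are hypothetical and the logic is
      heavily simplified. The sketch also assumes the convention that a
      true return value signals an error, which the "||" in point 3
      suggests.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, much simplified column definition. The collation
// "binary" stands for a contextual collation that is not resolved yet.
struct column_def
{
  std::string name;
  std::string collation;
};

// Stage 1: basic attribute initialization, including resolving
// contextual collations. Returns true on error.
bool prepare_stage1(std::vector<column_def>& cols,
                    const std::string& table_charset)
{
  for (column_def& c : cols)
  {
    if (c.name.empty())
      return true;
    if (c.collation == "binary")
      c.collation = table_charset + "_bin";
  }
  return false;
}

// Finalize: stands in for the rest of the old function. Here it only
// checks that no unresolved collation is left. Returns true on error.
bool prepare_finalize(const std::vector<column_def>& cols)
{
  for (const column_def& c : cols)
    if (c.collation == "binary")
      return true;
  return false;
}

// Wrapper for callers that do not need to run anything between the two
// stages. Short-circuit evaluation skips finalize if stage 1 fails.
bool prepare(std::vector<column_def>& cols, const std::string& table_charset)
{
  return prepare_stage1(cols, table_charset) || prepare_finalize(cols);
}
```

      A caller that has to do work between the stages, as
      mysql_create_frm_image() does with the partition related code, would
      call the two stage functions directly instead of the wrapper.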
  7. 10 Mar, 2023 3 commits
    • MDEV-30775 Performance regression in fil_space_t::try_to_close() introduced in MDEV-23855 · 7d6b3d40
      Vlad Lesin authored
      fil_node_open_file_low() tries to close files from the top of
      fil_system.space_list if the limit on the number of open files is exceeded.
      
      It invokes fil_space_t::try_to_close(), which iterates the list searching
      for the first opened space. Then it just closes the space, leaving it in
      the same position in fil_system.space_list.
      
      Under heavy file opening, such as during 'SHOW TABLE STATUS ...' execution,
      if the limit on the number of open files is reached,
      fil_space_t::try_to_close() iterates over more and more closed spaces before
      reaching any opened space for each fil_node_open_file_low() call. This
      causes a performance regression if the number of spaces is large enough.
      
      The fix is to keep opened spaces at the top of fil_system.space_list,
      and to move closed spaces to the end of the list.
      
      For this purpose, the fil_space_t::space_list_last_opened pointer is
      introduced. It points to the last inserted opened space in
      fil_space_t::space_list. When a space is opened, it is inserted at the
      position just after the one the pointer points to in
      fil_space_t::space_list, to preserve the logic introduced in
      MDEV-23855. Any closed space is added to the end of
      fil_space_t::space_list.
      
      As opened spaces are located at the top of fil_space_t::space_list,
      fil_space_t::try_to_close() finds an opened space faster.
      
      Opened and closed spaces can still be mixed in
      fil_space_t::space_list if fil_system.freeze_space_list was set during
      fil_node_open_file_low() execution. But this should not cause any error,
      as fil_space_t::try_to_close() still iterates over the spaces in the list.
      
      There is no need for a test case for the fix, as it does not change any
      functionality, but just fixes a performance regression.
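
      The list discipline described above can be sketched with a std::list
      and an iterator that plays the role of space_list_last_opened. This
      is a toy model under stated assumptions: the class space_list_model,
      its members, and the "entries inspected" return value are
      hypothetical, and there is no counterpart of freeze_space_list.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <vector>

// Hypothetical, simplified tablespace descriptor.
struct space { int id; bool is_open; };

class space_list_model
{
  std::list<space> spaces;
  // Points to the most recently inserted opened space, or to
  // spaces.end() if no space is open.
  std::list<space>::iterator last_opened = spaces.end();

public:
  void add_closed(int id) { spaces.push_back({id, false}); }

  // Marks the space as opened and moves it right after last_opened,
  // so that all opened spaces stay at the front of the list.
  void open(int id)
  {
    auto it = std::find_if(spaces.begin(), spaces.end(),
                           [id](const space& s) { return s.id == id; });
    assert(it != spaces.end() && !it->is_open);
    it->is_open = true;
    auto pos = last_opened == spaces.end() ? spaces.begin()
                                           : std::next(last_opened);
    spaces.splice(pos, spaces, it);
    last_opened = it;
  }

  // Closes the first opened space and moves it to the end of the list.
  // Returns the number of entries inspected, or 0 if nothing was open.
  std::size_t try_to_close()
  {
    std::size_t inspected = 0;
    for (auto it = spaces.begin(); it != spaces.end(); ++it)
    {
      ++inspected;
      if (!it->is_open)
        continue;
      if (last_opened == it)
        last_opened = spaces.end();  // it was the only opened space
      it->is_open = false;
      spaces.splice(spaces.end(), spaces, it);
      return inspected;
    }
    return 0;
  }

  // Current list order, for inspection.
  std::vector<int> ids() const
  {
    std::vector<int> out;
    for (const space& s : spaces)
      out.push_back(s.id);
    return out;
  }
};
```

      Because the opened spaces form a prefix of the list in this model,
      try_to_close() finds one after inspecting a single entry, no matter
      how many closed spaces the list contains.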
    • Fixes to mysql_install_db · ceb0e7f9
      Monty authored
      - Change to use 'mariadbd' instead of 'mysqld' in help texts and other
        visible places.
      - Start binary 'mariadbd' instead of 'mysqld'. This will remove a warning
        in 11.0 when running mysql_install_db.
      - Use my_print_defaults --mariadbd instead of --mysqld
      - Use --skip-log-error if the user doesn't have access to the log-error file.
        This is needed to allow mysql_install_db to work silently for users that
        do not have write access to /var/log.
      
      Other things:
      - Updated my_print_defaults to support --mariadbd
    • Merge 10.5 into 10.6 · f169dfb4
      Marko Mäkelä authored
  8. 09 Mar, 2023 3 commits
    • MDEV-30810 errmsg-utf8.txt no longer uses charsets · b600671f
      Daniel Black authored
      Charset names in the 'languages' line are not used any more.
      
      Removing to avoid confusion.
      
      All messages in errmsg-utf8.txt are in utf8 now.
      
      Charset names should have been removed in MySQL-5.5 during: https://dev.mysql.com/worklog/task/?id=751
      
      Bump version number.
    • MDEV-30819 InnoDB fails to start up after downgrading from MariaDB 11.0 · 08267ba0
      Marko Mäkelä authored
      While downgrades are not supported and misguided attempts at them could
      cause serious corruption, especially after
      commit b07920b6,
      it might be useful if InnoDB would start up even after an upgrade to
      MariaDB Server 11.0 or later had removed the change buffer.
      
      innodb_change_buffering_update(): Disallow anything other than
      innodb_change_buffering=none when the change buffer is corrupted.
      
      ibuf_init_at_db_start(): Mention a possible downgrade in the corruption
      error message. If innodb_change_buffering=none, ignore the error but do
      not initialize ibuf.index.
      
      ibuf_free_excess_pages(), ibuf_contract(), ibuf_merge_space(),
      ibuf_update_max_tablespace_id(), ibuf_delete_for_discarded_space(),
      ibuf_print(): Check for !ibuf.index.
      
      ibuf_check_bitmap_on_import(): Remove some unnecessary code.
      This function is only accessing change buffer bitmap pages in a
      data file that is not attached to the rest of the database.
      It is not accessing the change buffer tree itself, hence it does
      not need any additional mutex protection.
      
      This has been tested both by starting up MariaDB Server 10.8 on
      an 11.0 data directory, and by running ./mtr --big-test while
      ibuf_init_at_db_start() was tweaked to always fail.
    • Merge branch '10.10' into 10.11 · b4c7f5e6
      Sergei Golubchik authored
  9. 08 Mar, 2023 4 commits
  10. 07 Mar, 2023 3 commits
  11. 06 Mar, 2023 2 commits