1. 20 Nov, 2023 1 commit
  2. 18 Nov, 2023 1 commit
      MDEV-31953 madvise(..., MADV_FREE) is causing a performance regression · 23234835
      Marko Mäkelä authored
      buf_page_t::set_os_unused(): Remove the system call that had been added in
      commit 16c97187 and revised in
      commit c1fd082e for Microsoft Windows.
      
      buf_pool_t::garbage_collect(): A new function to collect any garbage
      from the InnoDB buffer pool that can be removed without writing any
      log or data files. This will also invoke madvise() for all of buf_pool.free.
      
      To trigger this the following MDEV is implemented:
      MDEV-24670 avoid OOM by linux kernel co-operative memory management
      
      To avoid frequent triggers that caused the MDEV-31953 regression, while
      still preserving the 10.11 functionality of non-greedy kernel memory
      usage, memory triggers are used.
      
      When a memory pressure event is triggered (if the Linux kernel supports
      this), the garbage collection of the InnoDB buffer pool is started.
      
      The hard coded triggers occur where there is:
      * some memory pressure in 5 of the last 10 seconds
      * a full stall on memory pressure for 10ms in the last 2 seconds
      
      The kernel will trigger only once in each of these time windows. To avoid
      MariaDB being in a constant state of memory garbage collection, this has
      been limited to once per minute.
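
The two hard-coded triggers and the once-per-minute limit can be sketched as follows. This is a minimal illustration and not the code from this commit: `psi_trigger()` and `gc_rate_limiter` are hypothetical names. It assumes the Linux pressure stall information (PSI) interface, where a trigger is registered by writing `<some|full> <stall_us> <window_us>` to a pressure file such as a cgroup's `memory.pressure` and then polling that file descriptor for `POLLPRI`.

```cpp
#include <cassert>
#include <chrono>
#include <string>

// Hypothetical helper: format one PSI trigger line.
// "some" = at least one task stalled; "full" = all non-idle tasks stalled.
std::string psi_trigger(bool full, std::chrono::microseconds stall,
                        std::chrono::microseconds window)
{
  assert(stall <= window);  // the stall threshold cannot exceed the window
  return std::string(full ? "full " : "some ") +
         std::to_string(stall.count()) + ' ' +
         std::to_string(window.count());
}

// Hypothetical rate limiter: lets an event through at most once per
// min_interval, so that frequent kernel notifications do not keep the
// server in constant garbage collection.
class gc_rate_limiter
{
  using clock = std::chrono::steady_clock;
  std::chrono::seconds min_interval_;
  clock::time_point last_{};
  bool fired_ = false;
public:
  explicit gc_rate_limiter(std::chrono::seconds min_interval)
    : min_interval_(min_interval) {}

  // Returns true if garbage collection may run at time `now`.
  bool allow(clock::time_point now)
  {
    if (fired_ && now - last_ < min_interval_)
      return false;
    fired_ = true;
    last_ = now;
    return true;
  }
};
```

Under these assumptions, "some memory pressure in 5 of the last 10 seconds" corresponds to `psi_trigger(false, seconds(5), seconds(10))` and "a full stall for 10ms in the last 2 seconds" to `psi_trigger(true, milliseconds(10), seconds(2))`. A limiter constructed with `seconds(60)` would then gate the call into the buffer pool garbage collection.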
      
      For a small set of kernels in 2023 (6.5, 6.6), there was a limit requiring
      CAP_SYS_RESOURCE that was lifted[1] to support the use case of user
      memory pressure. It is not currently possible to set CAP_SYS_RESOURCE in
      a systemd service, because that means setting a capability inside a
      user namespace.
      
      Running under systemd v254+ requires the default MemoryPressureWatch=auto
      (or alternately "on").
      
      Functionality was tested successfully on Fedora with a 6.4 kernel,
      running under a systemd service.
      
      Running in a container requires that (unmask=)/sys/fs/cgroup be writable
      by the mariadbd process.
      
      To aid testing, buf_pool_resize serves as a convenient point at
      which to trigger garbage collection.
      
      ref [1]: https://lore.kernel.org/all/CAMw=ZnQ56cm4Txgy5EhGYvR+Jt4s-KVgoA9_65HKWVMOXp7a9A@mail.gmail.com/T/#m3bd2a73c5ee49965cb73a830b1ccaa37ccf4e427
      
      Co-Author: Daniel Black (on memory pressure trigger)
      
      Reviewed by: Marko Mäkelä, Vladislav Vaintroub, Vladislav Lesin,
         Thirunarayanan Balathandayuthapani
      
      Tested by: Matthias Leich
  3. 17 Nov, 2023 2 commits
      MDEV-32027 Opening all .ibd files on InnoDB startup can be slow · eb1f8b29
      Marko Mäkelä authored
      dict_find_max_space_id(): Return SELECT MAX(SPACE) FROM SYS_TABLES.
      
      dict_check_tablespaces_and_store_max_id(): In the normal case
      (no encryption plugin has been loaded and the change buffer is empty),
      invoke dict_find_max_space_id() and do not open any .ibd files.
      If a std::set<uint32_t> has been specified, open the files whose
      tablespace ID is mentioned. Else, open all data files that are identified
      by SYS_TABLES records.
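
The decision described above can be sketched as a small standalone function. This is only an illustration of one reading of the commit message, not InnoDB code: `open_plan`, `startup_state`, `choose_open_plan()` and `ids_to_open()` are hypothetical names, and it is an assumption that an explicitly specified set takes precedence over the "normal case" shortcut.

```cpp
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical outcome of the startup check.
enum class open_plan
{
  NONE,    // only compute MAX(SPACE); open no .ibd files
  LISTED,  // open only the files whose tablespace ID was requested
  ALL      // open every data file identified by SYS_TABLES records
};

// Hypothetical summary of the conditions named in the commit message.
struct startup_state
{
  bool encryption_plugin_loaded;
  bool change_buffer_empty;
};

// Assumption: a caller-supplied set of IDs is honoured first.
open_plan choose_open_plan(const startup_state &s,
                           const std::set<std::uint32_t> *wanted)
{
  if (wanted)
    return open_plan::LISTED;
  if (!s.encryption_plugin_loaded && s.change_buffer_empty)
    return open_plan::NONE;
  return open_plan::ALL;
}

// Filter the tablespace IDs found in SYS_TABLES down to those to open.
std::vector<std::uint32_t>
ids_to_open(const std::vector<std::uint32_t> &sys_tables_ids,
            const std::set<std::uint32_t> *wanted)
{
  std::vector<std::uint32_t> out;
  for (std::uint32_t id : sys_tables_ids)
    if (!wanted || wanted->count(id))
      out.push_back(id);
  return out;
}
```

In this sketch, the common startup path returns `open_plan::NONE`, which is where the startup time is saved: no data file is opened until a table is actually accessed.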
      
      fil_ibd_open(): Remove a call to os_file_get_last_error() that can
      report a misleading error, such as EINVAL inside my_realpath() that is
      not an actual error. This could be invoked when a data file is found
      but the FSP_SPACE_FLAGS are incorrect, such as is the case for
      table test.td in
      ./mtr --mysqld=--innodb-buffer-pool-dump-at-shutdown=0 innodb.table_flags
      
      buf_load(): If any tablespaces could not be found, invoke
      dict_check_tablespaces_and_store_max_id() on the missing tablespaces.
      
      dict_load_tablespace(): Try to load the tablespace unless it was found
      to be futile. This fixes failures related to FTS_*.ibd files for
      FULLTEXT INDEX.
      
      btr_cur_t::search_leaf(): Prevent a crash when the tablespace
      does not exist. This was caught by the test innodb_fts.fts_concurrent_insert
      when the change to dict_load_tablespace() was not present.
      
      We modify a few tests to ensure that tables will not be loaded at startup.
      For some fault injection tests this means that the corrupted tables
      will not be loaded, because dict_load_tablespace() would perform stricter
      checks than dict_check_tablespaces_and_store_max_id().
      
      Tested by: Matthias Leich
      Reviewed by: Thirunarayanan Balathandayuthapani
      Merge 10.5 into 10.6 · 44b9e416
      Marko Mäkelä authored
  4. 16 Nov, 2023 5 commits
      MDEV-26055: Correct the formula for adaptive flushing · 9a545eb6
      Marko Mäkelä authored
      This is a 10.5 backport of 10.6
      commit d4265fbd.
      
      page_cleaner_flush_pages_recommendation(): If dirty_pct is
      between innodb_max_dirty_pages_pct_lwm
      and innodb_max_dirty_pages_pct,
      scale the effort relative to how close we are to
      innodb_max_dirty_pages_pct.
      
      The previous formula was missing a multiplication by 100.
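
To see why the missing factor matters, consider a simplified sketch of such a scaling. It is an assumption that the effort is expressed as an integer percentage of the I/O capacity and computed as `dirty_pct * 100 / max_pct`; `flush_effort_pct()` is a hypothetical name and this is not the actual body of page_cleaner_flush_pages_recommendation().

```cpp
#include <cassert>

// Hypothetical, simplified effort calculation.
// dirty_pct: current percentage of dirty pages in the buffer pool
// lwm_pct:   innodb_max_dirty_pages_pct_lwm (assumed > 0 here)
// max_pct:   innodb_max_dirty_pages_pct
unsigned flush_effort_pct(double dirty_pct, double lwm_pct, double max_pct)
{
  assert(lwm_pct > 0 && lwm_pct <= max_pct);
  if (dirty_pct < lwm_pct)
    return 0;    // below the low-water mark: no flushing for this reason
  if (dirty_pct >= max_pct)
    return 100;  // at or above the limit: full effort
  // Without the "* 100", the ratio dirty_pct / max_pct is below 1 in this
  // branch and would truncate to 0 when converted to an integer.
  return static_cast<unsigned>(dirty_pct * 100 / max_pct);
}
```

For example, with a low-water mark of 10, a limit of 90 and 45 per cent of pages dirty, this sketch yields an effort of 50, whereas the ratio without the multiplication would truncate to 0.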
      MDEV-26055: Improve adaptive flushing · a3d0d5fc
      Marko Mäkelä authored
      This is a 10.5 backport from 10.6
      commit 9593cccf.
      
      Adaptive flushing is enabled by setting innodb_max_dirty_pages_pct_lwm>0
      (not default) and innodb_adaptive_flushing=ON (default).
      There is also the parameter innodb_adaptive_flushing_lwm
      (default: 10 per cent of the log capacity). It should enable some
      adaptive flushing even when innodb_max_dirty_pages_pct_lwm=0.
      That is not being changed here.
      
      This idea was first presented by Inaam Rana several years ago,
      and I discussed it with Jean-François Gagné at FOSDEM 2023.
      
      buf_flush_page_cleaner(): When we are not near the log capacity limit
      (neither buf_flush_async_lsn nor buf_flush_sync_lsn is set),
      also try to move clean blocks from the buf_pool.LRU list to buf_pool.free
      or initiate writes (but not the eviction) of dirty blocks, until
      the remaining I/O capacity has been consumed.
      
      buf_flush_LRU_list_batch(): Add the parameter bool evict, to specify
      whether dirty least recently used pages (from buf_pool.LRU) should
      be evicted immediately after they have been written out. Callers outside
      buf_flush_page_cleaner() will pass evict=true, to retain the existing
      behaviour.
      
      buf_do_LRU_batch(): Add the parameter bool evict.
      Return counts of evicted and flushed pages.
      
      buf_flush_LRU(): Add the parameter bool evict.
      Assume that the caller holds buf_pool.mutex and
      will invoke buf_dblwr.flush_buffered_writes() afterwards.
      
      buf_flush_list_holding_mutex(): A low-level variant of buf_flush_list()
      whose caller must hold buf_pool.mutex and invoke
      buf_dblwr.flush_buffered_writes() afterwards.
      
      buf_flush_wait_batch_end_acquiring_mutex(): Remove. It is enough to have
      buf_flush_wait_batch_end().
      
      page_cleaner_flush_pages_recommendation(): Avoid some floating-point
      arithmetic.
      
      buf_flush_page(), buf_flush_check_neighbor(), buf_flush_check_neighbors(),
      buf_flush_try_neighbors(): Rename the parameter "bool lru" to "bool evict".
      
      buf_free_from_unzip_LRU_list_batch(): Remove the parameter.
      Only actual page writes will contribute towards the limit.
      
      buf_LRU_free_page(): Evict freed pages of temporary tables.
      
      buf_pool.done_free: Broadcast whenever a block is freed
      (and buf_pool.try_LRU_scan is set).
      
      buf_pool_t::io_buf_t::reserve(): Retry indefinitely.
      During the test encryption.innochecksum we easily run out of
      these buffers for PAGE_COMPRESSED or ENCRYPTED pages.
      
      Tested by Matthias Leich and Axel Schwenke
      MDEV-31861 Empty INSERT crashes with innodb_force_recovery=6 or innodb_read_only=ON · 5a1f821b
      Marko Mäkelä authored
      ha_innobase::extra(): Do not invoke log_buffer_flush_to_disk()
      if high_level_read_only holds.
      
      log_buffer_flush_to_disk(): Remove an assertion that duplicates one
      at the start of log_write_up_to().
      MDEV-32050 fixup: innodb.instant_alter_crash · 55a96c05
      Marko Mäkelä authored
      This test occasionally fails with a failure to purge history.
      Let us try to purge everything before starting the interesting part,
      to make that occasional failure go away.
      MDEV-32811 Potentially broken crash recovery if a mini-transaction frees a page, not modifying previously clean pages · 6c342459
      Thirunarayanan Balathandayuthapani authored
      
      - The 11.2 test innodb.sys_truncate_debug fails while executing an INSERT
      statement. The reason for the failure is that the same mini-transaction
      frees, allocates, and then frees again the same page. Page initialization
      clears the FIL_PAGE_LSN on the page, and the FIL_PAGE_LSN is not set again
      after the page is freed for the second time.
      This issue was caused by commit f46efb44.
      
      mtr_t::commit(): Set the FIL_PAGE_LSN even if the page has been freed.
  5. 15 Nov, 2023 6 commits
      Merge 10.4 into 10.5 · 8b509a5d
      Rex authored
      MDEV-32757: rollback crash on corruption · ea6ca013
      Marko Mäkelä authored
      trx_undo_free_page(): Detect a case of corrupted TRX_UNDO_PAGE_LIST.
      
      trx_undo_truncate_end(): Stop attempts to truncate a corrupted log.
      
      trx_t::commit_empty(): Add an error message about a corrupted log.
      
      Reviewed by: Thirunarayanan Balathandayuthapani
      Merge 10.5 into 10.6 · 5dbe7a8c
      Marko Mäkelä authored
      Merge 10.5 into 10.6 · 52ca2e65
      Marko Mäkelä authored
      MDEV-32757 innodb_undo_log_truncate=ON is not crash safe · a0f02f74
      Marko Mäkelä authored
      trx_purge_truncate_history(): Do not prematurely mark dirty pages
      as clean. This will be done in mtr_t::commit_shrink() as part of
      Shrink::operator()(mtr_memo_slot_t*). Also, register each dirty page
      only once in the mini-transaction.
      
      fsp_page_create(): Adjust and simplify the page creation during
      undo tablespace truncation. We can directly reuse pages that are
      already in buf_pool.page_hash.
      
      This fixes a regression that was caused by
      commit f5794e1d (MDEV-26445).
      
      Tested by: Matthias Leich
      Reviewed by: Thirunarayanan Balathandayuthapani
      MDEV-32689: Remove Ubuntu Bionic from 10.5 · 15bb8acf
      Tuukka Pasanen authored
      Remove Ubuntu Bionic from
      debian/autobake-debs.sh, as it is no longer
      used to build official MariaDB images.
      
      REMINDER TO MERGER: This commit should not
      be merged up to 10.6 or forward
  6. 14 Nov, 2023 6 commits
  7. 13 Nov, 2023 4 commits
  8. 11 Nov, 2023 2 commits
  9. 10 Nov, 2023 1 commit
  10. 09 Nov, 2023 1 commit
      MDEV-32737 innodb.log_file_name fails on Assertion `after_apply || !(blocks).end' in recv_sys_t::clear · e0c65784
      Marko Mäkelä authored
      
      recv_group_scan_log_recs(): Set the debug flag recv_sys.after_apply
      after actually completing the log scan.
      
      In the test, suppress some errors that may be reported when
      the crash recovery of RENAME TABLE t1 TO t2 is preceded by
      copying t2.ibd to t1.ibd.
  11. 08 Nov, 2023 9 commits
  12. 07 Nov, 2023 1 commit
  13. 06 Nov, 2023 1 commit