1. 09 Feb, 2022 8 commits
    • support lzma < 5.1.3alpha · 9bd7e526
      Sergei Golubchik authored
      where `lzma_allocator *allocator` isn't declared const
      9bd7e526
    • Merge 10.6 into 10.7 · 70a88755
      Marko Mäkelä authored
      70a88755
    • Merge 10.5 into 10.6 · cce99405
      Marko Mäkelä authored
      cce99405
    • MDEV-27716 mtr_t::commit() acquires log_sys.mutex when writing no log · fd101daa
      Marko Mäkelä authored
      mtr_t::is_block_dirtied(), mtr_t::memo_push(): Never set m_made_dirty
      for pages of the temporary tablespace. Ever since
      commit 5eb53955
      we never add those pages to buf_pool.flush_list.
      
      mtr_t::commit(): Implement part of mtr_t::prepare_write() here,
      and avoid acquiring log_sys.mutex if no log is written.
      During IMPORT TABLESPACE fixup, we do not write log, but we must
      add pages to buf_pool.flush_list and for that, be prepared
      to acquire log_sys.flush_order_mutex.
      
      mtr_t::do_write(): Replaces mtr_t::prepare_write().
      fd101daa
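The control flow described in this commit message can be sketched roughly as follows. This is a simplified Python model, not the InnoDB code: the `Log` class, the `commit` function, and the counter `mutex_acquisitions` are hypothetical names for illustration, and the real lock hand-over between the two mutexes is more subtle than shown here.

```python
import threading


class Log:
    """Hypothetical stand-in for log_sys; the field names are illustrative."""

    def __init__(self):
        self.mutex = threading.Lock()              # protects the log buffer
        self.flush_order_mutex = threading.Lock()  # protects flush-list order
        self.mutex_acquisitions = 0
        self.lsn = 0
        self.flush_list = []


def commit(log, log_bytes, dirtied_pages):
    """Commit a mini-transaction in this simplified model."""
    if log_bytes:
        # Log is written: the log mutex is needed to append to the log.
        with log.mutex:
            log.mutex_acquisitions += 1
            log.lsn += len(log_bytes)
            with log.flush_order_mutex:
                log.flush_list.extend(dirtied_pages)
    elif dirtied_pages:
        # No log written (for example an IMPORT TABLESPACE fixup), but pages
        # must still reach the flush list: only the flush-order mutex is taken.
        with log.flush_order_mutex:
            log.flush_list.extend(dirtied_pages)
    # Neither log nor dirtied pages: no mutex is taken at all.
```

In this model, a commit that writes no log never touches `log.mutex`, which is the saving the commit message describes.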
    • Oleksandr Byelkin authored
      bbd4837f
    • Oleksandr Byelkin authored
      1bed5640
    • Oleksandr Byelkin authored
      34c50196
    • MDEV-27734 Set innodb_change_buffering=none by default · 5c46751f
      Marko Mäkelä authored
      The aim of the InnoDB change buffer is to avoid delays when a leaf page
      of a secondary index is not present in the buffer pool, and a record needs
      to be inserted, delete-marked, or purged. Instead of reading the page into
      the buffer pool for making such a modification, we may insert a record to
      the change buffer (a special index tree in the InnoDB system tablespace).
      The buffered changes are guaranteed to be merged if the index page
      actually needs to be read later.
      
      The change buffer could be useful when the database is stored on a
      rotational medium (hard disk) where random seeks are slower than
      sequential reads or writes.
      
      Obviously, the change buffer will cause write amplification, due to
      the potentially large amount of metadata that is being written to the
      change buffer. We will have to write redo log records for modifying
      the change buffer tree as well as the user tablespace. Furthermore,
      in the user tablespace, we must maintain a change buffer bitmap page
      that uses 2 bits for estimating the amount of free space in pages,
      and 1 bit to specify whether buffered changes exist. This bitmap needs
      to be updated on every operation, which could reduce performance.
      
      Even if the change buffer were free of bugs such as MDEV-24449
      (potentially causing the corruption of any page in the system tablespace)
      or MDEV-26977 (corruption of secondary indexes due to a currently
      unknown reason), it would still make diagnosis of other data corruption harder.
      
      Because of all this, it is best to disable the change buffer by default.
      5c46751f
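To make the bitmap overhead mentioned above concrete, here is a minimal sketch of a per-page bitmap with a 2-bit free-space estimate and a 1-bit "buffered changes exist" flag. The packing of 4 bits per page, the class name `ChangeBufferBitmap`, and the `updates` counter are assumptions made for illustration; this is not the InnoDB on-disk format.

```python
BITS_PER_PAGE = 4    # assumed packing: two pages share one byte
BUFFERED_SHIFT = 2   # bit 2 of each nibble: buffered changes exist
FREE_MASK = 0b11     # bits 0-1 of each nibble: coarse free-space estimate


class ChangeBufferBitmap:
    """Illustrative bitmap tracking free space and buffered-change flags."""

    def __init__(self, n_pages):
        self.bits = bytearray((n_pages * BITS_PER_PAGE + 7) // 8)
        self.updates = 0  # every modification is an extra write to account for

    @staticmethod
    def _locate(page_no):
        byte, odd = divmod(page_no, 2)
        return byte, (4 if odd else 0)

    def get(self, page_no):
        byte, shift = self._locate(page_no)
        nibble = (self.bits[byte] >> shift) & 0xF
        return nibble & FREE_MASK, bool((nibble >> BUFFERED_SHIFT) & 1)

    def set(self, page_no, free, buffered):
        if not 0 <= free <= 3:
            raise ValueError("free-space estimate must fit in 2 bits")
        byte, shift = self._locate(page_no)
        old = (self.bits[byte] >> shift) & 0xF
        # Preserve the unused top bit of the nibble; rewrite the other three.
        nibble = (old & 0b1000) | (int(buffered) << BUFFERED_SHIFT) | free
        cleared = self.bits[byte] & ~(0xF << shift) & 0xFF
        self.bits[byte] = cleared | (nibble << shift)
        self.updates += 1
```

The `updates` counter stands in for the cost the message points out: each operation on an index page implies a bitmap write in addition to the change itself.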
  2. 08 Feb, 2022 9 commits
    • bump the VERSION · 9055db2f
      Daniel Bartholomew authored
      9055db2f
    • bump the VERSION · fa73117b
      Daniel Bartholomew authored
      fa73117b
    • bump the VERSION · f7704d74
      Daniel Bartholomew authored
      f7704d74
    • MDEV-26585 Wrong query results when `using index for group-by` · 38058c04
      Monty authored
      The problem was that "group_min_max optimization" does not work if
      some aggregate functions, like COUNT(*), are used.
      The function get_best_group_min_max() is using the join->sum_funcs
      array to check which aggregate functions are used.
      The bug was that aggregates in HAVING were not yet added to
      join->sum_funcs at the time get_best_group_min_max() was called.
      
      Fixed by populating join->sum_funcs already in prepare, which means that
      all sum functions will be in join->sum_funcs in get_best_group_min_max().
      A benefit of this approach is that we can remove several calls to
      make_sum_func_list() from the code and simplify the function.
      
      I removed some wrong setting of 'sort_and_group'.
      This variable is set when alloc_group_fields() is called, as part
      of allocating the cache needed by end_send_group() and does not need
      to be set by other functions.
      
      One problematic thing was that Spider is using *join->sum_funcs to detect
      at which stage the optimizer is and do internal calculations of aggregate
      functions. Updating join->sum_funcs early caused Spider to fail when trying
      to find min/max values in opt_sum_query().
      Fixed by temporarily resetting sum_funcs during opt_sum_query().
      
      Reviewer: Sergei Petrunia
      38058c04
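The failure mode can be illustrated with a small model. A loose index scan reads only the first and last index entry of each group, so it can answer MIN and MAX but not COUNT(*). If aggregates from HAVING are missing from the list that the eligibility check inspects, the check passes when it should not. All names below (`loose_scan_min_max`, `can_use_loose_scan`, `collect_sum_funcs`) are hypothetical and only mimic the idea, not the server code.

```python
from itertools import groupby

# Index entries sorted by (group column, value), as an ordered index stores them.
INDEX = [("a", 1), ("a", 5), ("a", 9), ("b", 2), ("b", 3)]


def loose_scan_min_max(index):
    """Return (min, max) per group using only the first and last entry."""
    result = {}
    for key, rows in groupby(index, key=lambda entry: entry[0]):
        rows = list(rows)  # a real scan would seek instead of materializing
        result[key] = (rows[0][1], rows[-1][1])
    return result


def can_use_loose_scan(sum_funcs):
    """The scan never visits every row, so only MIN and MAX are answerable."""
    return all(func in ("MIN", "MAX") for func in sum_funcs)


def collect_sum_funcs(select_list, having, include_having):
    """Model of the aggregate list the eligibility check looks at."""
    funcs = list(select_list)
    if include_having:
        funcs += having
    return funcs
```

With `include_having=False` (the behavior before the fix, in this model) a query with `MIN` in the select list and `COUNT` in HAVING is wrongly accepted; with `include_having=True` it is rejected.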
    • MDEV-27442 Wrong result upon query with DISTINCT and EXISTS subquery · d314bd26
      Monty authored
      The problem was that get_best_group_min_max() did not check if fields used
      by the "group_min_max optimization" were used in sub queries.
      Because of this, it did not detect that a key (b,a) was used in the WHERE
      clause for the statement:
      SELECT DISTINCT b FROM t1 WHERE EXISTS ( SELECT 1 FROM DUAL WHERE a > 1 ).
      
      Fixed by also traversing the sub queries when checking if a field is used.
      This disables group_min_max_optimization for the above query.
      
      Reviewer: Sergei Petrunia
      d314bd26
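The fix amounts to descending into subqueries when looking for field references. A minimal sketch on a toy expression tree follows; the `Expr` class and `uses_field` function are hypothetical and do not correspond to the server's Item classes.

```python
from dataclasses import dataclass, field


@dataclass
class Expr:
    """Toy expression node: kind is "field", "const", "op" or "subquery"."""
    kind: str
    name: str = ""
    children: list = field(default_factory=list)


def uses_field(expr, name, descend_subqueries):
    """Report whether the expression tree references the named field."""
    if expr.kind == "field":
        return expr.name == name
    if expr.kind == "subquery" and not descend_subqueries:
        return False  # models the old behavior: subqueries were not inspected
    return any(uses_field(child, name, descend_subqueries)
               for child in expr.children)


# WHERE EXISTS (SELECT 1 FROM DUAL WHERE a > 1)
where = Expr("op", "EXISTS", [
    Expr("subquery", children=[
        Expr("op", ">", [Expr("field", "a"), Expr("const", "1")]),
    ]),
])
```

Here `uses_field(where, "a", descend_subqueries=False)` misses the reference to `a`, while passing `True` finds it, which is what makes the optimizer reject the optimization for the query above.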
    • MENT-328 Retry BACKUP STAGE BLOCK DDL in case of deadlocks · a1c23807
      Monty authored
      MENT-328 wrongly assumed that the backup failed because of warnings from
      mariabackup about files that were not found. This is normal (and the error
      message should be deleted).
      
      randgen failed because mariabackup didn't retry BACKUP STAGE BLOCK DDL
      if it failed with a deadlock.
      
      To simplify things, I implemented the retry loop in the server as
      this particular deadlock should be quickly resolved.
      a1c23807
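A retry loop of the kind described can be sketched as below. The exception class `DeadlockError`, the function `retry_on_deadlock`, and the limits are hypothetical; the commit message does not state how many attempts the server makes or how long it waits.

```python
import time


class DeadlockError(Exception):
    """Hypothetical stand-in for a deadlock reported by the lock manager."""


def retry_on_deadlock(operation, max_attempts=5, delay=0.0):
    """Run operation(), retrying only when it fails with a deadlock."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except DeadlockError:
            if attempt == max_attempts:
                raise  # give up and surface the deadlock to the caller
            time.sleep(delay)  # brief pause; such deadlocks should clear fast
```

Only the deadlock case is retried; any other exception propagates immediately, so unrelated failures are not masked.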
    • Monty authored
      0ec27d7b
    • Fixes some compiler issues on AIX ( · 88fb89ac
      Monty authored
      88fb89ac
    • Fixed my_addr_resolve (cherry picked from 10.6) · df02de68
      Monty authored
      When a server is compiled with -fPIE, my_addr_resolve needs to
      subtract the info.dli_fbase from symbol addresses in memory for
      addr2line to recognize them.  When a server is compiled without -fPIE,
      my_addr_resolve should not do it.  Unfortunately not all compilers
      define __PIE__ when -fPIE was used (e.g. older gcc doesn't), so we
      have to resort to run-time detection.
      df02de68
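The address adjustment can be sketched as follows. The detection shown here, reading `e_type` from the ELF header, is one possible run-time check and is an assumption: the commit message does not say which detection method my_addr_resolve uses. Note also that `ET_DYN` marks shared objects in general, so this check is only meaningful for the main executable. The function names are hypothetical.

```python
import struct

ET_EXEC, ET_DYN = 2, 3  # e_type values defined by the ELF specification


def elf_is_pie(header: bytes) -> bool:
    """Guess from the first bytes of an executable whether it is position
    independent (e_type == ET_DYN)."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    byteorder = "<" if header[5] == 1 else ">"  # EI_DATA: 1 = little endian
    (e_type,) = struct.unpack_from(byteorder + "H", header, 16)
    return e_type == ET_DYN


def address_for_addr2line(addr, load_base, is_pie):
    """Subtract the load base (dli_fbase in the message) only for PIE builds."""
    return addr - load_base if is_pie else addr
```

For a PIE binary loaded at `0x555555554000`, a runtime address `0x555555555123` maps to the file-relative address `0x1123`, which is the form addr2line expects in that case.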
  3. 07 Feb, 2022 1 commit
    • MDEV-27754 : Assertion with innodb_flush_method=O_DSYNC · 881918bf
      Vladislav Vaintroub authored
      If innodb_flush_method=O_DSYNC, log_sys.flushed_to_disk_lsn  is changed
      without 'flush_lock' protection inside log_write().
      
      This leads to a race condition, if there are 2 threads running in parallel,
      doing log_write_up_to() with different values for 'flush_to_disk'.
      
      In this case, log_write() and log_write_flush_to_disk_low() can execute at
      the same time, and both would change flushed_lsn.
      
      The fix is to remove special treatment of durable writes from log_write().
      There is no apparent reason for this special treatment, because
      log_write_flush_to_disk_low() is already optimized for durable writes.
      
      Nor is there an apparent reason to call log_flush_notify() more often
      for O_DSYNC.
      881918bf
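The invariant restored by the fix is that the flushed LSN is updated on a single code path, under the flush lock. The following Python model illustrates that invariant only; `LogSys`, `flush_up_to`, and `run_flushers` are hypothetical names and the threading is far simpler than in the server.

```python
import threading


class LogSys:
    """Toy model: one field, one lock, one place that updates the field."""

    def __init__(self):
        self.flushed_to_disk_lsn = 0
        self.flush_lock = threading.Lock()

    def flush_up_to(self, lsn):
        # The only writer of flushed_to_disk_lsn, always under flush_lock,
        # and never moving the value backwards.
        with self.flush_lock:
            if lsn > self.flushed_to_disk_lsn:
                self.flushed_to_disk_lsn = lsn


def run_flushers(n_threads=4, per_thread=500):
    """Hammer flush_up_to() from several threads; return the final LSN."""
    log = LogSys()

    def worker(start):
        for lsn in range(start, start + per_thread):
            log.flush_up_to(lsn)

    threads = [threading.Thread(target=worker, args=(i * per_thread,))
               for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log.flushed_to_disk_lsn
```

Because every update goes through the lock and only ever increases the value, the result is the highest LSN requested regardless of thread interleaving. A second, unlocked writer, as in the bug described above, would break that guarantee.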
  4. 06 Feb, 2022 3 commits
  5. 05 Feb, 2022 1 commit
  6. 04 Feb, 2022 6 commits
  7. 03 Feb, 2022 3 commits
    • MDEV-27058 fixup: Crash in innodb.leaf_page_corrupted_during_recovery · 82f5981e
      Marko Mäkelä authored
      buf_page_get_low(): If the page was read-fixed, validate the page ID
      because the page could have been marked as corrupted. We should retry
      the page read in this case, instead of returning a soon-to-be-evicted
      corrupted page to the caller.
      
      This was initially only observed on Microsoft Windows.
      On Linux, this was repeated after adding a sleep
      to buf_pool_t::corrupted_evict() between
      bpage->zip.fix.fetch_sub() and bpage->lock.x_unlock().
      82f5981e
    • MDEV-27736 Allow seamless upgrade despite ROW_FORMAT=COMPRESSED · 05c33d62
      Marko Mäkelä authored
      In commit 9bc874a5 (MDEV-23497)
      the configuration option innodb_read_only_compressed was introduced
      to give users advance notice of a plan to remove ROW_FORMAT=COMPRESSED
      support for InnoDB.
      
      Based on user feedback, this plan has been scrapped.
      Even though ROW_FORMAT=COMPRESSED is a dead end and causes some
      overhead for InnoDB data structures, we can live with that.
      
      Now that we know that some users really want to keep using
      ROW_FORMAT=COMPRESSED, the previous default value of the parameter
      innodb_read_only_compressed=ON should be changed to OFF, to allow
      smooth upgrades to 10.6 and later versions, without requiring users
      to update any configuration file.
      05c33d62
    • Merge branch '10.5' into 10.6 · f5c5f8e4
      Oleksandr Byelkin authored
      f5c5f8e4
  8. 01 Feb, 2022 2 commits
    • Merge branch '10.4' into 10.5 · cf63eece
      Oleksandr Byelkin authored
      cf63eece
    • MDEV-26326 mariabackup skip valid ibd file · 8d742fe4
      Thirunarayanan Balathandayuthapani authored
      - Store the deferred tablespace name while loading the tablespace
      for the backup process.
      
      - Mariabackup stores the list of space ids which have page0 INIT_PAGE
      records. backup_first_page_op() and first_page_init() were introduced
      to track the page0 INIT_PAGE records.
      
      - backup_file_op() and log_file_op() were changed to handle
      FILE_MODIFY redo log records. These are used to identify the
      deferred tablespace space id.
      
      - Whenever a file operation redo log record is processed by backup,
      backup_file_op() should check whether the space name exists
      in the deferred tablespace list. If it does, it needs to store the
      space id and name when a FILE_MODIFY or FILE_RENAME redo log record is
      processed, and it should delete the tablespace name from the defer list
      in other cases.
      
      - backup_fix_ddl() should check whether the deferred tablespace has
      any page0 init records. If it does, consider the tablespace
      a newly created tablespace. If not, backup should try
      to reload the tablespace with SRV_BACKUP_NO_DEFER mode to
      avoid deferring the tablespace.
      8d742fe4
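The bookkeeping described in the bullets above can be sketched as a small model. The record tuples, the `"INIT_PAGE0"` and `"FILE_DELETE"` labels, and the functions `process_redo` and `fix_ddl` are hypothetical simplifications; only the decision rules are taken from the commit message.

```python
def process_redo(records, deferred_names):
    """Scan (type, space_id, name) redo records for deferred tablespaces.

    Returns (deferred, page0_init): a name -> space id map for deferred
    tablespaces still of interest, and the set of space ids that had a
    page0 INIT_PAGE record.
    """
    pending = set(deferred_names)
    deferred = {}
    page0_init = set()
    for rec_type, space_id, name in records:
        if rec_type == "INIT_PAGE0":
            page0_init.add(space_id)
        elif name in pending:
            if rec_type in ("FILE_MODIFY", "FILE_RENAME"):
                deferred[name] = space_id  # remember the space id and name
            else:
                pending.discard(name)      # any other file operation:
                deferred.pop(name, None)   # drop it from the defer list
    return deferred, page0_init


def fix_ddl(deferred, page0_init):
    """Decide how each deferred tablespace is handled at the end of backup."""
    return {
        name: ("treat as newly created" if space_id in page0_init
               else "reload with SRV_BACKUP_NO_DEFER")
        for name, space_id in deferred.items()
    }
```

For example, a deferred file whose space id appears in a page0 INIT_PAGE record is treated as newly created, while one that only has a FILE_MODIFY record is reloaded without deferring.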
  9. 31 Jan, 2022 7 commits