1. 14 Feb, 2022 7 commits
    • MDEV-20605 Awaken transaction can miss inserted by other transaction records due to wrong persistent cursor restoration · 20e9e804
      Vlad Lesin authored
      
      sel_restore_position_for_mysql() moves the persistent cursor position
      forward after the btr_pcur_restore_position() call if the cursor's
      relative position is BTR_PCUR_ON and the cursor points to a record whose
      field values are NOT the same as in the stored record (subject to some
      other conditions that are not important for this case).
      
      This was done because btr_pcur_restore_position() sets
      page_cur_mode_t mode to PAGE_CUR_LE for cursor->rel_pos == BTR_PCUR_ON
      before opening the cursor. So we search for a record less than or equal
      to the stored one. If the found record is not equal to the stored one,
      then it is less, and we need to move the cursor forward.
      
      But there can be a situation where the stored record was purged, and a
      new one with the same key but a different value was inserted while
      row_search_mvcc() was suspended. In this case, when the thread is
      awakened, it invokes sel_restore_position_for_mysql(), which in turn
      invokes btr_pcur_restore_position(), which returns false because the
      found record does not match the stored record, and
      sel_restore_position_for_mysql() moves the cursor position forward.
      
      This can lead to the case where an awakened row_search_mvcc() does not
      see records inserted by other transactions while it slept. The mtr test
      case shows an example of how this can happen.
      
      The fix is to return a special value from the persistent cursor restoring
      function to notify its caller that the unique fields of the restored
      record and the stored record are the same; in this case
      sel_restore_position_for_mysql() does not move the cursor forward.
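
The restoration logic described above can be illustrated with a toy model. This is only a sketch in Python, not InnoDB code: the index is assumed to be a sorted list of (key, payload) tuples with unique keys, and the names restore_position, position_after_wakeup, and the three status constants are hypothetical.

```python
from bisect import bisect_right

# Hypothetical status values for this sketch (not InnoDB's actual identifiers).
SAME_ALL, SAME_UNIQ, NOT_SAME = "same_all", "same_uniq", "not_same"


def restore_position(index, stored):
    """Position on the last record whose key is <= the stored key.

    index  -- list of (key, payload) tuples sorted by key, keys unique
    stored -- the (key, payload) tuple remembered before the suspension
    Returns (position, status); position is -1 if no such record exists.
    """
    keys = [key for key, _ in index]
    pos = bisect_right(keys, stored[0]) - 1  # "less than or equal" search
    if pos < 0:
        return pos, NOT_SAME
    if index[pos] == stored:
        return pos, SAME_ALL
    if index[pos][0] == stored[0]:
        return pos, SAME_UNIQ  # same unique fields, different payload
    return pos, NOT_SAME


def position_after_wakeup(index, stored, fixed):
    """Return the position the scan examines next after waking up."""
    pos, status = restore_position(index, stored)
    if status == SAME_ALL:
        return pos  # exactly the stored record: stay on it
    if status == SAME_UNIQ and fixed:
        return pos  # fixed behaviour: same key, so do not step forward
    return pos + 1  # found record is treated as smaller: step forward


# (5, "old") was purged and (5, "new") inserted while the scan was suspended.
index = [(1, "a"), (5, "new"), (9, "c")]
stored = (5, "old")
print(position_after_wakeup(index, stored, fixed=False))  # 2: (5, "new") is skipped
print(position_after_wakeup(index, stored, fixed=True))   # 1: (5, "new") is examined
```

In this model the unfixed variant steps past the newly inserted record, which mirrors the missed-record symptom described in the commit message.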
      
      Delete-marked records are correctly processed in row_search_mvcc().
      Non-unique secondary indexes are "uniquified" by adding the PK, so
      index->n_uniq should then be index->n_fields. So there is no need for
      additional checks in the fix.
      
      If a transaction's read view cannot see the changes made to a secondary
      index record, it requests the clustered index record in
      row_search_mvcc() to check its transaction id and get the corresponding
      record version. After this, row_search_mvcc() commits the mtr to
      preserve clustered index latching order, and starts a new mtr. Between
      that mtr commit and start, secondary index pages are unlatched, and
      purge is able to remove the record stored in the cursor, which causes
      row duplication in the result set for non-locking reads, as the cursor
      position is restored to the previously visited record.
      
      To solve this, the changes are simply switched off for non-locking
      reads. It is quite a simple solution; besides, the changes do not make
      sense for non-locking reads.
      
      A more complex solution, but a more effective one from a performance
      perspective, is to create an mtr savepoint before requesting the
      clustered record and to roll back to that savepoint afterwards. See
      MDEV-27557.
      
      One more solution is to have a per-record transaction id for secondary
      indexes. See MDEV-17598.
      
      If any of those is implemented, just remove the select_lock_type
      argument in sel_restore_position_for_mysql().
    • Merge 10.4 into 10.5 · 52b32c60
      Marko Mäkelä authored
    • Merge mariadb-10.5.15 into 10.5 · 6405ed63
      Marko Mäkelä authored
    • Merge 10.3 into 10.4 · c9bc10e6
      Marko Mäkelä authored
    • Merge mariadb-10.4.24 into 10.4 · 4964f181
      Marko Mäkelä authored
    • Merge 10.2 into 10.3 · e928fdbf
      Marko Mäkelä authored
    • Merge mariadb-10.3.34 into 10.3 · a6ef239b
      Marko Mäkelä authored
  2. 12 Feb, 2022 4 commits
  3. 11 Feb, 2022 4 commits
  4. 10 Feb, 2022 10 commits
  5. 09 Feb, 2022 6 commits
    • MDEV-27716 mtr_t::commit() acquires log_sys.mutex when writing no log · fd101daa
      Marko Mäkelä authored
      mtr_t::is_block_dirtied(), mtr_t::memo_push(): Never set m_made_dirty
      for pages of the temporary tablespace. Ever since
      commit 5eb53955
      we never add those pages to buf_pool.flush_list.
      
      mtr_t::commit(): Implement part of mtr_t::prepare_write() here,
      and avoid acquiring log_sys.mutex if no log is written.
      During IMPORT TABLESPACE fixup, we do not write log, but we must
      add pages to buf_pool.flush_list and for that, be prepared
      to acquire log_sys.flush_order_mutex.
      
      mtr_t::do_write(): Replaces mtr_t::prepare_write().
    • Oleksandr Byelkin's avatar
      34c50196
    • Oleksandr Byelkin's avatar
      8a7776a8
    • Oleksandr Byelkin's avatar
      e3524445
    • Oleksandr Byelkin's avatar
      941bc705
    • MDEV-27734 Set innodb_change_buffering=none by default · 5c46751f
      Marko Mäkelä authored
      The aim of the InnoDB change buffer is to avoid delays when a leaf page
      of a secondary index is not present in the buffer pool, and a record needs
      to be inserted, delete-marked, or purged. Instead of reading the page into
      the buffer pool for making such a modification, we may insert a record to
      the change buffer (a special index tree in the InnoDB system tablespace).
      The buffered changes are guaranteed to be merged if the index page
      actually needs to be read later.
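
The buffering idea in the paragraph above can be sketched with a small toy model. This is an illustration in Python under simplifying assumptions, not InnoDB's implementation: pages are plain sets of records, only insert and purge are modelled (delete-marking is left out), and the class name ToyChangeBuffer and all of its members are hypothetical.

```python
class ToyChangeBuffer:
    """Toy model: defer changes to pages that are not in memory."""

    def __init__(self, disk):
        self.disk = disk        # page_no -> set of records "on disk"
        self.buffer_pool = {}   # page_no -> set of records held in memory
        self.pending = {}       # page_no -> list of buffered (op, record)
        self.page_reads = 0     # number of simulated page reads

    def modify(self, page_no, op, record):
        """Apply a change directly if the page is in memory, else buffer it."""
        if page_no in self.buffer_pool:
            self._apply(self.buffer_pool[page_no], op, record)
        else:
            self.pending.setdefault(page_no, []).append((op, record))

    def read_page(self, page_no):
        """Load the page if needed and merge any buffered changes into it."""
        if page_no not in self.buffer_pool:
            self.page_reads += 1
            page = set(self.disk[page_no])
            for op, record in self.pending.pop(page_no, []):
                self._apply(page, op, record)
            self.buffer_pool[page_no] = page
        return self.buffer_pool[page_no]

    @staticmethod
    def _apply(page, op, record):
        if op == "insert":
            page.add(record)
        elif op == "purge":
            page.discard(record)
        else:
            raise ValueError(f"unknown operation: {op}")


cb = ToyChangeBuffer({7: {"a"}})
cb.modify(7, "insert", "b")   # page 7 is not in memory: the change is buffered
cb.modify(7, "purge", "a")
print(cb.page_reads)          # 0: no page read was needed for the two changes
print(sorted(cb.read_page(7)))  # ['b']: buffered changes merged on first read
```

The model shows only the deferred-merge behaviour; it does not capture the redo logging or bitmap maintenance costs discussed in the rest of the commit message.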
      
      The change buffer could be useful when the database is stored on a
      rotational medium (hard disk) where random seeks are slower than
      sequential reads or writes.
      
      Obviously, the change buffer will cause write amplification, due to
      the potentially large amount of metadata that is being written to the
      change buffer. We will have to write redo log records for modifying
      the change buffer tree as well as the user tablespace. Furthermore,
      in the user tablespace, we must maintain a change buffer bitmap page
      that uses 2 bits for estimating the amount of free space in pages,
      and 1 bit to specify whether buffered changes exist. This bitmap needs
      to be updated on every operation, which could reduce performance.
      
      Even if the change buffer were free of bugs such as MDEV-24449
      (potentially causing the corruption of any page in the system tablespace)
      or MDEV-26977 (corruption of secondary indexes due to a currently
      unknown reason), it will make diagnosis of other data corruption harder.
      
      Because of all this, it is best to disable the change buffer by default.
  6. 08 Feb, 2022 9 commits
    • bump the VERSION · f7704d74
      Daniel Bartholomew authored
    • bump the VERSION · 2f07b21c
      Daniel Bartholomew authored
    • bump the VERSION · 30cc63fa
      Daniel Bartholomew authored
    • bump the VERSION · c0a44ff7
      Daniel Bartholomew authored
    • MDEV-26585 Wrong query results when `using index for group-by` · 38058c04
      Monty authored
      The problem was that "group_min_max optimization" does not work if
      some aggregate functions, like COUNT(*), are used.
      The function get_best_group_min_max() is using the join->sum_funcs
      array to check which aggregate functions are used.
      The bug was that aggregates in HAVING were not yet added to
      join->sum_funcs at the time get_best_group_min_max() was called.
      
      Fixed by populating join->sum_funcs already in prepare, which means that
      all sum functions will be in join->sum_funcs in get_best_group_min_max().
      A benefit of this approach is that we can remove several calls to
      make_sum_func_list() from the code and simplify the function.
      
      I removed some wrong setting of 'sort_and_group'.
      This variable is set when alloc_group_fields() is called, as part
      of allocating the cache needed by end_send_group() and does not need
      to be set by other functions.
      
      One problematic thing was that Spider is using *join->sum_funcs to detect
      which stage the optimizer is at and to do internal calculations of
      aggregate functions. Updating join->sum_funcs early caused Spider to fail
      when trying to find min/max values in opt_sum_query().
      Fixed by temporarily resetting sum_funcs during opt_sum_query().
      
      Reviewer: Sergei Petrunia
    • MDEV-27442 Wrong result upon query with DISTINCT and EXISTS subquery · d314bd26
      Monty authored
      The problem was that get_best_group_min_max() did not check if fields used
      by the "group_min_max optimization" were used in subqueries.
      Because of this, it did not detect that a key (b,a) was used in the WHERE
      clause for the statement:
      SELECT DISTINCT b FROM t1 WHERE EXISTS ( SELECT 1 FROM DUAL WHERE a > 1 ).
      
      Fixed by also traversing the subqueries when checking if a field is used.
      This disables group_min_max_optimization for the above query.
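
The effect of descending into subqueries can be shown with a toy expression tree. This is a sketch in Python, not the server's item tree: the classes Field, Const, Func, and Subquery and the function field_is_used are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Field:
    name: str


@dataclass
class Const:
    value: object


@dataclass
class Func:
    name: str
    args: list


@dataclass
class Subquery:
    where: object  # the WHERE expression of the nested SELECT


def field_is_used(expr, name, traverse_subqueries):
    """Return True if the column `name` is referenced anywhere in `expr`."""
    if isinstance(expr, Field):
        return expr.name == name
    if isinstance(expr, Func):
        return any(field_is_used(arg, name, traverse_subqueries)
                   for arg in expr.args)
    if isinstance(expr, Subquery):
        # Without traversal, references inside the subquery go unnoticed.
        return traverse_subqueries and field_is_used(
            expr.where, name, traverse_subqueries)
    return False  # constants reference no columns


# WHERE EXISTS ( SELECT 1 FROM DUAL WHERE a > 1 )
where = Func("EXISTS", [Subquery(Func(">", [Field("a"), Const(1)]))])
print(field_is_used(where, "a", traverse_subqueries=False))  # False
print(field_is_used(where, "a", traverse_subqueries=True))   # True
```

With traversal enabled, the reference to a inside the EXISTS subquery is detected, which is the condition under which the commit message says the optimization is disabled for this query.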
      
      Reviewer: Sergei Petrunia
    • MENT-328 Retry BACKUP STAGE BLOCK DDL in case of deadlocks · a1c23807
      Monty authored
      MENT-328 wrongly assumed that the backup failed because of warnings from
      mariabackup about files that were not found. This is normal (and the
      error message should be deleted).
      
      randgen failed because mariabackup didn't retry BACKUP STAGE BLOCK DDL
      if it failed with a deadlock.
      
      To simplify things, I implemented the retry loop in the server as
      this particular deadlock should be quickly resolved.
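
A retry loop of the kind mentioned above could look roughly like the following. This is a generic sketch in Python, not the server code: DeadlockError, retry_on_deadlock, and the max_attempts and delay_seconds parameters are hypothetical, and the commit message does not state any retry limit or delay.

```python
import time


class DeadlockError(Exception):
    """Hypothetical stand-in for a deadlock error reported by an operation."""


def retry_on_deadlock(operation, max_attempts=5, delay_seconds=0.0):
    """Call operation(); on DeadlockError retry, up to max_attempts calls."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except DeadlockError:
            if attempt == max_attempts:
                raise  # give up and report the deadlock to the caller
            time.sleep(delay_seconds)  # brief pause before the next attempt


# Usage: an operation that deadlocks twice and then succeeds.
calls = {"count": 0}


def block_ddl():
    calls["count"] += 1
    if calls["count"] < 3:
        raise DeadlockError("deadlock")
    return "blocked"


print(retry_on_deadlock(block_ddl))  # blocked
print(calls["count"])                # 3
```

Only the deadlock error is retried here; any other exception propagates immediately, which keeps unrelated failures visible.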
    • Monty's avatar
      0ec27d7b
    • Fixes some compiler issues on AIX ( · 88fb89ac
      Monty authored