  1. 25 Oct, 2023 6 commits
    • MDEV-32050: Hold exclusive purge_sys.rseg->latch longer · 2027c482
      Marko Mäkelä authored
      Let the purge_coordinator_task acquire purge_sys.rseg->latch
      less frequently and hold it longer at a time. This may throttle
      concurrent DML and reduce purge lag a little.
      
      Remove an unnecessary std::this_thread::yield(), because the
      trx_purge_attach_undo_recs() is supposed to terminate the scan
      when running out of undo log records. Ultimately, this will
      result in purge_coordinator_state::do_purge() and
      purge_coordinator_callback() returning control to the thread pool.
      
      Reviewed by: Vladislav Lesin and Vladislav Vaintroub
    • MDEV-32050: Improve srv_wake_purge_thread_if_not_active() · 44689eb7
      Marko Mäkelä authored
      purge_sys_t::wake_if_not_active(): Replaces
      srv_wake_purge_thread_if_not_active().
      
      innodb_ddl_recovery_done(): Move the wakeup call to
      srv_init_purge_tasks().
      
      purge_coordinator_timer: Remove. The srv_master_callback() already
      invokes purge_sys.wake_if_not_active() once per second.
      
      Reviewed by: Vladislav Lesin and Vladislav Vaintroub
    • MDEV-32050: Deprecate&ignore innodb_purge_rseg_truncate_frequency · 14685b10
      Marko Mäkelä authored
      The motivation for introducing the parameter
      innodb_purge_rseg_truncate_frequency in
      mysql/mysql-server@28bbd66ea5f6acf80fcb381057bb7ca5b7b188d2 and
      mysql/mysql-server@8fc2120fed11d2498ecb3635d87f414c76985fce
      seems to have been to avoid stalls due to freeing undo log pages
      or truncating undo log tablespaces. In MariaDB Server,
      innodb_undo_log_truncate=ON should be a much lighter operation
      than in MySQL, because it will not involve any log checkpoint.
      
      Another source of performance stalls should be
      trx_purge_truncate_rseg_history(), which is shrinking the history list
      by freeing the undo log pages whose undo records have been purged.
      To alleviate that, we will introduce a purge_truncation_task that will
      offload this from the purge_coordinator_task. In that way, the next
      innodb_purge_batch_size pages may be parsed and purged while the pages
      from the previous batch are being freed and the history list being shrunk.
      
      The processing of innodb_undo_log_truncate=ON will still remain the
      responsibility of the purge_coordinator_task.
      
      purge_coordinator_state::count: Remove. We will ignore
      innodb_purge_rseg_truncate_frequency, and act as if it had been
      set to 1 (the maximum shrinking frequency).
      
      purge_coordinator_state::do_purge(): Invoke an asynchronous task
      purge_truncation_callback() to free the undo log pages.
      
      purge_sys_t::iterator::free_history(): Free those undo log pages
      that have been processed. This used to be a part of
      trx_purge_truncate_history().
      
      purge_sys_t::clone_end_view(): Take a new value of purge_sys.head
      as a parameter, so that it will be updated while holding exclusive
      purge_sys.latch. This is needed for race-free access to the field
      in purge_truncation_callback().
      
      Reviewed by: Vladislav Lesin
    • MDEV-32050: Clean up online ALTER · 21bec970
      Marko Mäkelä authored
      UndorecApplier::assign_rec(): Remove. We will pass the undo record to
      UndorecApplier::apply_undo_rec(). There is no need to copy the
      undo record, because nothing else can write to the undo log pages
      that belong to an active or incomplete transaction.
      
      trx_t::apply_log(): Buffer-fix the undo page across mini-transaction
      boundary in order to avoid repeated page lookups.
      
      Reviewed by: Vladislav Lesin
    • MDEV-32050: Clean up log parsing · 9bb5d9fe
      Marko Mäkelä authored
      purge_node_t, undo_node_t: Change the type of rec_type and cmpl_info
      to byte, because this data is being extracted from a single byte.
      
      UndorecApplier: Change type and cmpl_info to be of type byte, and
      move them next to the 16-bit offset field to minimize alignment bloat.
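
      The effect of placing single-byte fields next to a 16-bit field can be
      sketched as follows. The struct and member names are hypothetical
      stand-ins and do not reproduce the real UndorecApplier layout; the sketch
      only shows how member order changes padding.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical layout: single-byte fields scattered between wider members,
// so the compiler has to insert padding after each of them.
struct ScatteredFields {
  std::uint8_t type;       // 1 byte, followed by padding up to pointer alignment
  const void *undo_rec;    // pointer-aligned member
  std::uint16_t offset;    // 16-bit offset within the page
  std::uint8_t cmpl_info;  // 1 byte, followed by tail padding
};

// Same members, with both bytes placed right after the 16-bit field.
struct GroupedFields {
  const void *undo_rec;
  std::uint16_t offset;
  std::uint8_t type;       // fills the space directly after offset
  std::uint8_t cmpl_info;
};

static_assert(sizeof(GroupedFields) <= sizeof(ScatteredFields),
              "grouping the small fields must not make the struct larger");
```

      On a typical 64-bit ABI the scattered variant usually occupies 24 bytes
      and the grouped one 16, but the exact sizes are implementation-defined.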
      
      row_purge_parse_undo_rec(): Remove some redundant code. Purge will
      be started by innodb_ddl_recovery_done(), at which point all
      necessary subsystems will have been initialized.
      
      trx_purge_rec_t::undo_rec: Point to const.
      
      Reviewed by: Vladislav Lesin
    • MDEV-32050 preparation: Simplify ROLLBACK · ea42c4ba
      Marko Mäkelä authored
      undo_node_t::state: Replaced with bool is_temp.
      
      row_undo_rec_get(): Do not copy the undo log record.
      The motivation for the copying was to not hold latches on the undo pages
      and therefore to avoid deadlocks due to lock order inversion a.k.a.
      latching order violation: It is not allowed to wait for an index page latch
      while holding an undo page latch, because MVCC reads would first acquire
      an index page latch and then an undo page latch. But, in rollback, we
      do not actually need any latch on our own undo pages. The transaction
      that is being rolled back is the exclusive owner of its undo log records.
      They cannot be overwritten by other threads until the rollback is complete.
      Therefore, a buffer fix will protect the undo log record just fine,
      by preventing page eviction. We still must initially acquire a shared latch
      on each undo page, to avoid a race condition like the one that was fixed in
      commit b102872a.
      
      row_undo_ins_parse_undo_rec(): The first two bytes of the undo log record
      now are the pointer to the next record within the page, not a length.
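
      A minimal sketch of reading such a next-record pointer follows. The
      helper names are hypothetical, and the big-endian encoding of the two
      bytes is an assumption of this sketch; it is not the actual
      row_undo_ins_parse_undo_rec() code.

```cpp
#include <cstddef>
#include <cstdint>

// Read a 16-bit value stored most significant byte first
// (assumption: the two-byte field is big-endian).
inline std::uint16_t read_be16(const std::uint8_t *p) {
  return static_cast<std::uint16_t>((p[0] << 8) | p[1]);
}

// Hypothetical helper: the first two bytes of the record at rec_offset hold
// the page offset of the next record, not the length of this record.
inline std::uint16_t next_rec_offset(const std::uint8_t *page,
                                     std::uint16_t rec_offset) {
  return read_be16(page + rec_offset);
}

// If a length is needed, it has to be derived from the two offsets.
inline std::size_t rec_extent(const std::uint8_t *page,
                              std::uint16_t rec_offset) {
  return static_cast<std::size_t>(next_rec_offset(page, rec_offset)) -
         rec_offset;
}
```

      A caller that previously treated the field as a length would now have to
      subtract the record's own offset, as rec_extent() does.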
      
      Reviewed by: Vladislav Lesin
  2. 24 Oct, 2023 1 commit
    • MDEV-32530 Race condition in lock_wait_rpl_report() · b78b77e7
      Marko Mäkelä authored
      After acquiring lock_sys.latch, always load trx->lock.wait_lock.
      It could have been changed by another thread that did lock_rec_move()
      and released lock_sys.latch right before lock_sys.wr_lock_try()
      succeeded.
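
      The rule "reload after acquiring the latch" can be sketched like this.
      All types and the before_latch hook are hypothetical; the hook merely
      simulates another thread changing the pointer before the latch is
      obtained. This is not the lock_wait_rpl_report() code.

```cpp
#include <atomic>
#include <functional>
#include <mutex>

struct Lock { int id; };

struct Trx {
  std::atomic<const Lock *> wait_lock{nullptr};
};

// Returns the wait lock as observed while the latch is held.
// before_latch stands in for whatever other threads may do in the window
// between an early, unprotected read and the latch acquisition.
inline const Lock *wait_lock_under_latch(
    std::mutex &latch, Trx &trx,
    const std::function<void()> &before_latch = {}) {
  const Lock *stale = trx.wait_lock.load();  // unprotected read: a hint only
  (void)stale;                               // deliberately not used below
  if (before_latch) before_latch();          // the pointer may change here
  std::lock_guard<std::mutex> guard(latch);
  return trx.wait_lock.load();               // always reload under the latch
}
```

      Returning the value read before the latch was acquired would reproduce
      the kind of race described above.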
      
      This regression was introduced in
      commit e039720b (MDEV-32096).
      
      Reviewed by: Vladislav Lesin
  3. 23 Oct, 2023 3 commits
    • Merge 10.5 into 10.6 · b21f52ee
      Marko Mäkelä authored
    • MDEV-32552 Write-ahead logging is broken for freed pages · b5e43a1d
      Marko Mäkelä authored
      buf_page_free(): Flag the freed page as modified if it is found in
      the buffer pool.
      
      buf_flush_page(): If the page has been freed, ensure that the log
      for it has been durably written, before removing the page
      from buf_pool.flush_list.
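
      The write-ahead logging rule behind the buf_flush_page() change can be
      sketched as follows. RedoLog, Page and discard_freed_page() are
      hypothetical simplifications, not the InnoDB data structures.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

using lsn_t = std::uint64_t;

// Hypothetical redo log that only tracks how far it is durably written.
struct RedoLog {
  lsn_t flushed_lsn = 0;
  void write_and_flush_up_to(lsn_t lsn) {
    flushed_lsn = std::max(flushed_lsn, lsn);
  }
};

struct Page {
  std::uint32_t id;
  lsn_t newest_lsn;  // LSN of the last change, including the "free" itself
  bool freed;
};

// Drop a freed page from the flush list, but only after the log covering
// its last modification is durable.
inline void discard_freed_page(RedoLog &log, std::vector<Page> &flush_list,
                               std::uint32_t id) {
  auto it = std::find_if(flush_list.begin(), flush_list.end(),
                         [id](const Page &p) { return p.id == id; });
  if (it == flush_list.end() || !it->freed) return;
  if (it->newest_lsn > log.flushed_lsn)
    log.write_and_flush_up_to(it->newest_lsn);  // log first
  flush_list.erase(it);                         // then forget the page
}
```

      Erasing the page before the log write would let a checkpoint advance
      past changes that are not yet durable.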
      
      FindBlockX: Find also MTR_MEMO_PAGE_X_MODIFY in order to avoid an
      occasional failure of innodb.innodb_defrag_concurrent, which involves
      freeing and reallocating pages in the same mini-transaction.
      
      This fixes a regression that was introduced in
      commit a35b4ae8 (MDEV-15528).
      
      This logic was tested by commenting out the $shutdown_timeout line
      from a test and running the following:
      
      ./mtr --rr innodb.scrub
      rr replay var/log/mysqld.1.rr/mariadbd-0
      
      A breakpoint in the modified buf_flush_page() was hit, and the
      FIL_PAGE_LSN of that page had been last modified during the
      mtr_t::commit() of a mini-transaction where buf_page_free()
      had been executed on that page.
    • new CC v3.3 · 0a4103e6
      Oleksandr Byelkin authored
  4. 19 Oct, 2023 7 commits
    • MDEV-32113: utf8mb3_key_col=utf8mb4_value cannot be used for ref · 4941ac91
      Sergei Petrunia authored
      (Variant#3: Allow cross-charset comparisons, use a special
      CHARSET_INFO to create lookup keys. Review input addressed.)
      
      Equalities that compare utf8mb{3,4}_general_ci strings, like:
      
        WHERE ... utf8mb3_key_col=utf8mb4_value    (MB3-4-CMP)
      
      can now be used to construct ref[const] access and also participate
      in multiple-equalities.
      This means that utf8mb3_key_col can be used for key-lookups when
      compared with a utf8mb4 constant, field or expression using '=' or
      '<=>' comparison operators.
      
      This is controlled by optimizer_switch='cset_narrowing=on', which is
      OFF by default.
      
      IMPLEMENTATION
      Item value comparison in (MB3-4-CMP) is done using utf8mb4_general_ci.
      This is valid as any utf8mb3 value is also a utf8mb4 value.
      
      When making index lookup value for utf8mb3_key_col, we do "Charset
      Narrowing": characters that are in the Basic Multilingual Plane (=BMP) are
      copied as-is, as they can be represented in utf8mb3. Characters that are
      outside the BMP cannot be represented in utf8mb3 and are replaced
      with U+FFFD, the "Replacement Character".
      
      In utf8mb4_general_ci, the Replacement Character compares as equal to any
      character that's not in BMP. Because of this, the constructed lookup value
      will find all index records that would be considered equal by the original
      condition (MB3-4-CMP).
      Approved-by: Monty <monty@mariadb.org>
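
      The "Charset Narrowing" step described in this commit message can be
      sketched on decoded code points. narrow_to_bmp() is a hypothetical
      helper; the server works on encoded byte strings through a special
      CHARSET_INFO, which this sketch does not model.

```cpp
#include <string>

// U+FFFD, the Unicode "Replacement Character".
constexpr char32_t kReplacementChar = 0xFFFD;

// Copy BMP characters as-is and replace everything above U+FFFF,
// which utf8mb3 cannot represent, with U+FFFD.
inline std::u32string narrow_to_bmp(const std::u32string &in) {
  std::u32string out;
  out.reserve(in.size());
  for (char32_t c : in)
    out.push_back(c > 0xFFFF ? kReplacementChar : c);
  return out;
}
```

      For example, narrow_to_bmp(U"a\U0001F600b") yields U"a\uFFFDb", while a
      string that is entirely inside the BMP is returned unchanged.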
    • MDEV-32476 LeakSanitizer errors in get_quick_select or Assertion ... · 6a674c31
      Monty authored
      Problem was that JOIN_TAB::cleanup() was not run because
      JOIN::top_join_tab_count was not set in case of early errors.
      
      Fixed by setting JOIN::top_join_tab_count when the JOIN_TABs are allocated.
      
      Something that should eventually be fixed:
      - Cleaning up JOIN_TABs is now done in 3 different loops.
        JOIN_TAB::cleanup() is only doing a partial cleanup. Other cleanups
        are done outside of JOIN_TAB::cleanup().
      
      The above should be fixed so that JOIN_TAB::cleanup() frees
      everything related to its own memory, including all its sub JOIN_TABs.
      JOIN::cleanup() should only loop over all its top JOIN_TABs and call
      JOIN_TAB::cleanup() on these.
      This will greatly simplify and speed up the current code (as we now do some
      cleanups twice).
    • Fixed crash in is_stat_table() when using hash joins. · a1b6befc
      Monty authored
      Other usages of persistent statistics check 'stats_is_read' in the
      caller, which is why this was not noticed earlier.
      
      Other things:
      - Simplified no_stat_values_provided
    • Merge 10.5 into 10.6 · 6991b1c4
      Marko Mäkelä authored
    • MDEV-31851 After crash recovery, undo tablespace fails to open · 85751ed8
      Thirunarayanan Balathandayuthapani authored
      srv_all_undo_tablespaces_open(): While opening the extra unused
      undo tablespaces, InnoDB should use ULINT_UNDEFINED instead of
      SRV_SPACE_ID_UPPER_BOUND.
    • MDEV-31851 After crash recovery, undo tablespace fails to open · dbba1bb1
      Thirunarayanan Balathandayuthapani authored
      recv_recovery_from_checkpoint_start(): InnoDB should add the
      redo log block header + trailer size while checking the log
      sequence number in log file with log sequence number in the
      system tablespace first page.
    • MDEV-32144 fixup · 2d6dc65d
      Marko Mäkelä authored
      In commit 384eb570 the debug check
      was relaxed in trx_undo_header_create(), not in the intended function
      trx_undo_write_xid().
  5. 18 Oct, 2023 2 commits
    • MDEV-32511: Race condition between checkpoint and page write · cfd17881
      Marko Mäkelä authored
      fil_aio_callback(): Invoke fil_node_t::complete_write() before
      releasing any page latch, so that in case a log checkpoint is
      executed roughly concurrently with the first write into a file
      since the previous checkpoint, we will not miss a fdatasync()
      or fsync() call to make the write durable.
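
      The ordering requirement can be sketched as follows. FileNode, Page and
      on_write_complete() are hypothetical simplifications: the point is only
      that the "this file needs a sync" bookkeeping happens before the page
      latch is released, so that a checkpoint that runs after the release
      cannot miss it.

```cpp
#include <atomic>
#include <mutex>

// Hypothetical stand-in for a data file that remembers whether it has been
// written to since the last fdatasync()/fsync().
struct FileNode {
  std::atomic<bool> needs_sync{false};
  void complete_write() { needs_sync.store(true); }
  // Called by a checkpoint; returns true if a sync was actually required.
  bool sync_if_needed() { return needs_sync.exchange(false); }
};

struct Page {
  std::mutex latch;
};

// Called on write completion with page.latch held by the caller.
inline void on_write_complete(FileNode &node, Page &page) {
  node.complete_write();  // 1. record that the file needs a sync
  page.latch.unlock();    // 2. only then make the page available to others
}
```

      With the two statements swapped, a checkpoint could observe the released
      page, find needs_sync still false, and skip the sync.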
    • MDEV-32511 Assertion !os_aio_pending_writes() failed · bf7c6fc2
      Marko Mäkelä authored
      In MemorySanitizer builds of 10.10 and 10.11, we would rather often
      have the assertion fail in innodb_init() during mariadb-backup --prepare.
      The assertion could also fail during InnoDB startup, but less often.
      
      Before commit 685d958e in 10.8 the
      log file cleanup after a successfully applied backup is different,
      and the os_aio_pending_writes() assertion is in srv0start.cc.
      
      IORequest::write_complete(): Invoke node->complete_write() before
      releasing the page latch, so that a log checkpoint that is about to
      execute concurrently will not miss a fdatasync() or fsync() on the
      file, in case this was the first write since the last such call.
      
      create_log_file(), srv_start(): Replace the debug assertion with
      a debug check. For all intents and purposes, all writes could have
      been completed but some write_io_callback() may not have invoked
      io_slots::release() yet.
  6. 17 Oct, 2023 1 commit
    • MDEV-31851 After crash recovery, undo tablespace fails to open · 3da5d047
      Thirunarayanan Balathandayuthapani authored
      Problem:
      ========
      - InnoDB fails to open undo tablespace when page0 is corrupted
      and fails to throw error.
      
      Solution:
      =========
      - InnoDB throws DB_CORRUPTION error when InnoDB encounters
      page0 corruption of undo tablespace.
      
      - InnoDB restores the page0 of undo tablespace from
      doublewrite buffer if it encounters page corruption
      
      - Moved Datafile::restore_from_doublewrite() to
      recv_dblwr_t::restore_first_page(). So that undo
      tablespace and system tablespace can use this function
      instead of duplicating the code
      
      srv_undo_tablespace_open(): Returns 0 if file doesn't exist
      or ULINT_UNDEFINED if page0 is corrupted.
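
      A caller of such a function has to tell the two sentinel values apart
      from a real result. The sketch below assumes that any other return value
      means the tablespace was opened; kUndefined is a stand-in for
      ULINT_UNDEFINED, and OpenResult and classify_open_result() are
      hypothetical names.

```cpp
#include <cstddef>
#include <limits>

// Stand-in for ULINT_UNDEFINED (assumed to be the largest unsigned value).
constexpr std::size_t kUndefined = std::numeric_limits<std::size_t>::max();

enum class OpenResult { Opened, Missing, Corrupted };

// Map the return value described above onto explicit outcomes:
// 0 means the file does not exist, kUndefined means page 0 is corrupted.
inline OpenResult classify_open_result(std::size_t ret) {
  if (ret == 0) return OpenResult::Missing;
  if (ret == kUndefined) return OpenResult::Corrupted;
  return OpenResult::Opened;  // assumption: anything else is a success
}
```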
  7. 16 Oct, 2023 2 commits
  8. 14 Oct, 2023 3 commits
  9. 13 Oct, 2023 4 commits
  10. 12 Oct, 2023 3 commits
  11. 10 Oct, 2023 4 commits
  12. 08 Oct, 2023 4 commits
    • Fixed that log_slow.test works with view_protocol · b04af648
      Monty authored
      Part of the test did not work with view_protocol as the query written
      to the slow_log table is changed because of view_protocol.
    • Fixed compiler warnings in connect/odbconn.cpp · 1dd6d9a0
      Monty authored
    • MDEV-22243 type_test.type_test_double fails with 'NUMERIC_SCALE NULL' · 9d19b652
      Monty authored
      There were several reasons why the test failed:
      - Constructors for Field_double and Field_float changed an argument
        to the constructor instead of the correct class variable.
      - gcc 7.5.0 produced wrong code when inlining Field_double constructor
        into Field_test_double constructor.
      
      Fixed by changing the correct class variable and making the constructors
      non-inline to work around the gcc bug.
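
      The first cause (assigning to the constructor argument instead of the
      member) can be sketched as follows. The class names and both constants
      are hypothetical and do not reproduce Field_double or Field_float; the
      gcc inlining issue is not modelled here.

```cpp
#include <cstdint>

constexpr std::uint8_t kMaxFixedDec = 31;  // hypothetical threshold
constexpr std::uint8_t kNotFixedDec = 39;  // hypothetical "not fixed" marker

struct BuggyField {
  std::uint8_t dec;
  explicit BuggyField(std::uint8_t dec_arg) : dec(dec_arg) {
    // Bug: this only changes the local copy of the argument;
    // the member dec was already initialized and stays as passed in.
    if (dec_arg >= kMaxFixedDec) dec_arg = kNotFixedDec;
  }
};

struct FixedField {
  std::uint8_t dec;
  explicit FixedField(std::uint8_t dec_arg) : dec(dec_arg) {
    // Fix: adjust the class variable itself.
    if (dec >= kMaxFixedDec) dec = kNotFixedDec;
  }
};
```

      With these definitions, BuggyField(40).dec keeps the out-of-range value
      40, while FixedField(40).dec becomes kNotFixedDec.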
    • Fix merge commit 5ea5291d: No test file or result files should be executable · 8941bdc4
      Otto Kekalainen authored
      In commit 5ea5291d @sanja-byelkin for an unknown reason switched the file mode
      for 3 Galera tzinfo related test files from 644 -> 755. This exists only
      from branch 10.6 onward:
      
          $ git checkout 10.5
          $ find mysql-test -executable -name '*.test' -or -executable -name '*.result'
          (no results)
          $ git checkout 10.6
          $ find mysql-test -executable -name '*.test' -or -executable -name '*.result'
          mysql-test/suite/galera/t/mysql_tzinfo_to_sql.test
          mysql-test/suite/galera/t/mariadb_tzinfo_to_sql.test
          mysql-test/suite/galera/r/mariadb_tzinfo_to_sql.result
      
      mysql-test/suite/galera/t/mariadb_tzinfo_to_sql.test
      mysql-test/suite/galera/r/mariadb_tzinfo_to_sql.result
      
      No test file nor test result file should be executable, so run chmod -x
      on them.
      
      All new code of the whole pull request, including one or several files
      that are either new files or modified ones, are contributed under the
      BSD-new license. I am contributing on behalf of my employer Amazon Web
      Services, Inc.