1. 03 Feb, 2024 2 commits
  2. 02 Feb, 2024 1 commit
    • MDEV-33075 Resolve server shutdown issues on macOS, Solaris, and FreeBSD · 2f5174e5
      Vladislav Vaintroub authored
      This commit addresses multiple server shutdown problems observed on macOS,
      Solaris, and FreeBSD:
      
      1. Corrected a non-portable assumption where socket shutdown was expected
      to wake up poll() with listening sockets in the main thread.
      
      Use a more robust self-pipe to wake up poll() by writing to the pipe's
      write end.
      
      2. Fixed a random crash on macOS from pthread_kill(signal_handler)
      when the signal_handler was detached and the thread had already exited.
      
      Use the more robust `kill(getpid(), SIGTERM)` to wake up the signal handler
      thread.
      
      3. Made sure that the signal handler thread always exits once `abort_loop` is
      set, and also calls `my_thread_end()` and clears `signal_thread_in_use`
      when exiting.
      
      This fixes the warning "1 thread did not exit" from `my_global_thread_end()`
      seen on FreeBSD/macOS when the process is terminated via signal.
      
      Additionally, the shutdown code underwent light refactoring
      for better readability and maintainability:
      - Modified `break_connect_loop()` to no longer wait for the main thread,
        aligning behavior with Windows (since 10.4).
      - Removed dead code related to the unused `USE_ONE_SIGNAL_HAND`
        preprocessor constant.
      - Eliminated support for `#ifndef HAVE_POLL` in `handle_connection_sockets`.
        This code has also been dead since 10.4.
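
The self-pipe wake-up from item 1 can be sketched as follows. This is a minimal illustration in Python on a POSIX system, not the server's C++ code; `accept_loop` and `demo` are hypothetical names. It only shows that writing one byte to the pipe's write end makes poll() return, with no reliance on socket shutdown.

```python
import os
import select
import socket
import threading


def accept_loop(listener, wake_r, log):
    """Poll a listening socket together with the read end of a self-pipe."""
    poller = select.poll()
    poller.register(listener.fileno(), select.POLLIN)
    poller.register(wake_r, select.POLLIN)
    while True:
        for fd, _event in poller.poll():
            if fd == wake_r:
                os.read(wake_r, 1)          # drain the wake-up byte
                log.append("woken by pipe")
                return
            conn, _addr = listener.accept()  # a real client connected
            conn.close()


def demo():
    """Start the loop in a thread and stop it by writing to the pipe."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen()
    wake_r, wake_w = os.pipe()
    log = []
    worker = threading.Thread(target=accept_loop, args=(listener, wake_r, log))
    worker.start()
    os.write(wake_w, b"x")   # wake-up: the pipe becomes readable, poll() returns
    worker.join(timeout=5)
    still_running = worker.is_alive()
    listener.close()
    os.close(wake_r)
    os.close(wake_w)
    return log, still_running
```

Calling `demo()` starts the loop, wakes it through the pipe, and reports whether the thread is still running afterwards.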
  3. 29 Jan, 2024 6 commits
  4. 27 Jan, 2024 1 commit
    • MDEV-4991: GTID binlog indexing · d039346a
      Kristian Nielsen authored
      Improve the performance of slave connect using B+-Tree indexes on each binlog
      file. The index allows fast lookup of a GTID position to the corresponding
      offset in the binlog file, as well as lookup of a position to find the
      corresponding GTID position.
      
      This eliminates a costly sequential scan of the starting binlog file
      to find the GTID starting position when a slave connects. This is
      especially costly if the binlog file is not cached in memory (IO
      cost), or if it is encrypted or a lot of slaves connect simultaneously
      (CPU cost).
      
      The size of the index files is generally less than 1% of the binlog data, so
      it is not expected to be an issue.
      
      Most of the work writing the index is done as a background task, in
      the binlog background thread. This minimises the performance impact on
      transaction commit. A simple global mutex is used to protect index
      reads and (background) index writes; this is fine as slave connect is
      a relatively infrequent operation.
      
      Here are the user-visible options and status variables. The feature is on by
      default and is expected to need no tuning or configuration for most users.
      
      binlog_gtid_index
        On by default. Can be used to disable the indexes for testing purposes.
      
      binlog_gtid_index_page_size (default 4096)
        Page size to use for the binlog GTID index. This is the size of the nodes
        in the B+-tree used internally in the index. A very small page size (64 is
        the minimum) will be less efficient, but can be used to stress the
        B+-tree code during testing.
      
      binlog_gtid_index_span_min (default 65536)
        Control sparseness of the binlog GTID index. If set to N, at most one
        index record will be added for every N bytes of binlog file written.
        This can be used to reduce the number of records in the index, at
        the cost only of having to scan a few more events in the binlog file
        before finding the target position.
      
      Two status variables are available to monitor the use of the GTID indexes:
      
        Binlog_gtid_index_hit
        Binlog_gtid_index_miss
      
      The "hit" status increments for each successful lookup in a GTID index.
      The "miss" increments when a lookup is not possible. This indicates that the
      index file is missing (e.g. the binlog was written by an old server version
      without GTID index support), or corrupt.
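
The trade-off behind binlog_gtid_index_span_min can be illustrated with a toy sparse index. This sketch makes loud simplifying assumptions: a sorted list with binary search stands in for the B+-tree, a single increasing sequence number stands in for a full GTID position, and `SparseIndex` and `find` are hypothetical names, not server code.

```python
import bisect


class SparseIndex:
    """Toy sparse offset index: at most one record per span_min bytes."""

    def __init__(self, span_min):
        self.span_min = span_min
        self.keys = []            # indexed sequence numbers, ascending
        self.offsets = []         # file offset of each indexed event
        self.last_indexed = None  # offset of the most recent index record

    def maybe_add(self, seq_no, offset):
        """Record the event only if span_min bytes passed since the last record."""
        if self.last_indexed is None or offset - self.last_indexed >= self.span_min:
            self.keys.append(seq_no)
            self.offsets.append(offset)
            self.last_indexed = offset

    def start_offset(self, seq_no):
        """Offset to start scanning from, or None when the index cannot help."""
        i = bisect.bisect_right(self.keys, seq_no) - 1
        return self.offsets[i] if i >= 0 else None


def find(events, index, seq_no):
    """Return (offset, events_scanned) using the index plus a short forward scan."""
    start = index.start_offset(seq_no)
    if start is None:
        start = 0                 # miss: fall back to scanning from the beginning
    scanned = 0
    for event_seq, offset in events:
        if offset < start:
            continue              # skipped thanks to the index
        scanned += 1
        if event_seq == seq_no:
            return offset, scanned
    return None, scanned


# 1000 simulated events of 100 bytes each.
events = [(n, n * 100) for n in range(1, 1001)]
index = SparseIndex(span_min=4096)
for event_seq, offset in events:
    index.maybe_add(event_seq, offset)
```

With these made-up sizes the index holds one record per 41 events, so `find(events, index, 500)` scans only a handful of events instead of 500. A larger span_min shrinks the index and lengthens that final scan.
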
      Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
  5. 24 Jan, 2024 1 commit
  6. 23 Jan, 2024 1 commit
  7. 22 Jan, 2024 1 commit
    • MDEV-7850: Extend GTID Binlog Events with Thread Id · c37b2087
      Brandon Nesterenko authored
      This patch augments Gtid_log_event with the user thread-id.
      In particular, this compensates for the loss of this info in
      Rows_log_events.
      
      Gtid_log_event::thread_id becomes visible in mysqlbinlog output as a
      64-bit unsigned integer, for example:

        #231025 16:21:45 server id 1  end_log_pos 537 CRC32 0x1cf1d963  GTID 0-1-2 ddl thread_id=10

      Although the size of the Gtid event has grown by 8-9 bytes,
      replication between OLD and NEW servers is not affected by it.
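
A script that consumes this output could pick the thread id out of such a line roughly as follows. The pattern and the name `parse_gtid_line` are hypothetical. The sketch assumes the GTID is printed as domain-server-sequence and that `thread_id=<n>` follows it on the same line, as in the sample above.

```python
import re

# Sample line copied from the commit message.
line = ("#231025 16:21:45 server id 1  end_log_pos 537 CRC32 0x1cf1d963  "
        "GTID 0-1-2 ddl thread_id=10")

# Hypothetical pattern: GTID as domain-server-sequence, then thread_id=<n>.
GTID_LINE = re.compile(r"GTID (\d+)-(\d+)-(\d+)\b.*?\bthread_id=(\d+)")


def parse_gtid_line(text):
    """Return the GTID parts and thread id as integers, or None if absent."""
    match = GTID_LINE.search(text)
    if match is None:
        return None
    domain_id, server_id, seq_no, thread_id = map(int, match.groups())
    return {
        "domain_id": domain_id,
        "server_id": server_id,
        "seq_no": seq_no,
        "thread_id": thread_id,
    }
```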
      
      This work was started by the late Sujatha Sivakumar.
      Brandon Nesterenko took it over, reviewed initial patches and extended
      the work.
      
      Reviewed-by: <andrei.elkin@mariadb.com>
  8. 18 Jan, 2024 1 commit
    • MDEV-32894 mysqlbinlog flashback support binlog_row_image FULL_NODUP mode · 8bf9f218
      Libing Song authored
      Summary
      =======
      With FULL_NODUP mode, the before image includes all columns and the after
      image includes only the changed columns. Flashback will swap the
      values of the changed columns between the after image and the before image.
      For example:
        BI: c1, c2, c3_old, c4_old
        AI: c3_new, c4_new
      flashback will reconstruct the before and after images to
        BI: c1, c2, c3_new, c4_new
        AI: c3_old, c4_old
      
      Implementation
      ==============
      When parsing the before and after images of an Update_rows_event whose
      after image doesn't include all columns, the position and length of
      the fields are collected into ai_fields and bi_fields.
      
      The changed fields are swapped between bi_fields and ai_fields.
      Then it recreates the before image and after image by using
      bi_fields and ai_fields. nullbit will be set to 1 if the
      field is NULL, otherwise nullbit will be 0.
      
      It also optimizes flashback slightly:
      - calc_row_event_length is used instead of print_verbose_one_row
      - swap_buff1 and swap_buff2 are removed.
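
The swap described in the summary can be sketched on plain dictionaries. This only illustrates the column-level logic, using the example values from the summary. `flashback_update` is a hypothetical name, and the real code works on packed row images through bi_fields and ai_fields rather than dictionaries.

```python
def flashback_update(bi, ai):
    """Swap changed-column values between a full BI and a partial AI."""
    new_bi = dict(bi)              # unchanged columns stay in the before image
    new_ai = {}
    for col, new_value in ai.items():
        new_ai[col] = bi[col]      # the old value moves to the after image
        new_bi[col] = new_value    # the new value moves to the before image
    return new_bi, new_ai


bi = {"c1": 1, "c2": 2, "c3": "c3_old", "c4": "c4_old"}
ai = {"c3": "c3_new", "c4": "c4_new"}
new_bi, new_ai = flashback_update(bi, ai)
```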
  9. 17 Jan, 2024 1 commit
  10. 12 Jan, 2024 1 commit
    • MDEV-33049 Assertion `marked_for_write_or_computed()' failed in bool Field_new_decimal::store_value(const my_decimal*, int*) · be6d48fd
      Libing Song authored
      
      Analysis
      ========
      When the rpl applier is unpacking a before row image, Field::reset() is
      called before setting a field to NULL if the null bit of the field is set
      in the row image. Field_new_decimal::reset() calls
      Field_new_decimal::store_value() to reset the value. store_value() asserts
      that the field is in the write_set bitmap, because it assumes the field is
      being updated.

      That assumption does not hold for row images generated in FULL_NODUP
      mode. In that mode, the before image includes all fields and the after
      image includes only the updated fields.
      
      Fix
      ===
      When unpacking binlog row images, the assertion is meaningless.
      So the field being unpacked is temporarily marked in write_set to avoid
      the assertion failure.
  11. 10 Jan, 2024 10 commits
    • Merge 11.3 into 11.4 · d136169e
      Marko Mäkelä authored
    • Merge 11.2 into 11.3 · af4f9dae
      Marko Mäkelä authored
    • Merge 11.1 into 11.2 · e4cb1e32
      Marko Mäkelä authored
    • Merge 11.0 into 11.1 · c3a546e9
      Marko Mäkelä authored
    • Merge 10.11 into 11.0 · c2da55ac
      Marko Mäkelä authored
    • MDEV-26195 fixup: Remove page_no_t · 338ed5c4
      Marko Mäkelä authored
    • Merge 10.6 into 10.11 · 1eb11da3
      Marko Mäkelä authored
    • MDEV-33112 innodb_undo_log_truncate=ON is blocking page write · 3613fb2a
      Marko Mäkelä authored
      When innodb_undo_log_truncate=ON causes an InnoDB undo tablespace
      to be truncated, we must guarantee that the undo tablespace will
      be rebuilt atomically: After mtr_t::commit_shrink() has durably
      written the mini-transaction that rebuilds the undo tablespace,
      we must not write any old pages to the tablespace.
      
      To guarantee this, in trx_purge_truncate_history() we used to
      traverse the entire buf_pool.flush_list in order to acquire
      exclusive latches on all pages for the undo tablespace that
      reside in the buffer pool, so that those pages cannot be written
      and will be evicted during mtr_t::commit_shrink(). However, this
      traversal may interfere with the page writing activity of
      buf_flush_page_cleaner(). It would be better to lazily discard
      the old pages of the truncated undo tablespace.
      
      fil_space_t::is_being_truncated, fil_space_t::clear_stopping(): Remove.
      
      fil_space_t::create_lsn: A new field, identifying the LSN of the
      latest rebuild of a tablespace.
      
      buf_page_t::flush(), buf_flush_try_neighbors(): Evict pages whose
      FIL_PAGE_LSN is below fil_space_t::create_lsn.
      
      mtr_t::commit_shrink(): Update fil_space_t::create_lsn and
      fil_space_t::size right before the log is durably written and the
      tablespace file is being truncated.
      
      fsp_page_create(), trx_purge_truncate_history(): Simplify the logic.
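
The lazy discard rule can be sketched as a per-page check at flush time. This is a simplified model, not InnoDB code: `Page` and `flush` are hypothetical names, and the only rule taken from the description above is that a page whose LSN is below the tablespace's create_lsn is stale and is dropped instead of written.

```python
from dataclasses import dataclass


@dataclass
class Page:
    page_no: int
    lsn: int   # stands in for FIL_PAGE_LSN


def flush(pages, create_lsn):
    """Split dirty pages into those to write and those to discard as stale."""
    to_write, discarded = [], []
    for page in pages:
        if page.lsn < create_lsn:
            discarded.append(page.page_no)   # predates the rebuild: never write it
        else:
            to_write.append(page.page_no)
    return to_write, discarded


dirty = [Page(1, 90), Page(2, 150), Page(3, 99), Page(4, 100)]
written, dropped = flush(dirty, create_lsn=100)
```

Because the check happens page by page as each page reaches the flusher, no up-front traversal of the whole flush list is needed in this model.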
      
      Reviewed by: Thirunarayanan Balathandayuthapani, Vladislav Lesin
      Performance tested by: Axel Schwenke
      Correctness tested by: Matthias Leich
    • MDEV-33137: Assertion end_lsn == page_lsn failed in recv_recover_page · 4cbf75dd
      Marko Mäkelä authored
      trx_purge_free_segment(), trx_purge_truncate_rseg_history():
      Do not claim that the blocks will be modified in the mini-transaction,
      because that will not always be the case. Whenever there is a
      modification, mtr_t::set_modified() will flag it.
      
      The debug assertion that failed in recovery is checking that all
      changes to data pages are covered by log records. Due to these
      incorrect calls, we would unnecessarily write unmodified data pages,
      which is something that commit 05fa4558 aims to avoid.

      The incorrect calls had originally been added in
      commit de31ca6a (MDEV-32820) and commit 86767bcc (MDEV-29593).
      
      Reviewed by: Vladislav Lesin
      Tested by: Elena Stepanova
  12. 09 Jan, 2024 7 commits
  13. 08 Jan, 2024 4 commits
  14. 05 Jan, 2024 3 commits