1. 12 Mar, 2024 1 commit
  2. 09 Mar, 2024 1 commit
  3. 08 Mar, 2024 2 commits
    • Monty's avatar
      MDEV-33623 Partitioning is broken on big endian architectures · f838b2d7
      Monty authored
      MDEV-33502 Slowdown when running nested statement with many partitions
      caused this error, as I failed to take big-endian architectures into account.
      
      This patch also introduces bitmap_import() and bitmap_export() to be used
      when one wants to store bitmaps in files/logs in a portable way.
      Reviewed-by: Kristian Nielsen <knielsen@knielsen-hq.org>
      f838b2d7
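A minimal sketch of why a fixed byte order makes stored bitmaps portable. The helper names `export_words()` and `import_words()` are hypothetical and are not the bitmap_export()/bitmap_import() functions from the patch; the sketch only assumes that a bitmap is an array of 64-bit words and that the stored form is least-significant byte first.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not MariaDB code): serialize each 64-bit word
// least-significant byte first, independent of the host byte order.
std::vector<unsigned char> export_words(const std::vector<uint64_t>& words) {
  std::vector<unsigned char> out;
  out.reserve(words.size() * 8);
  for (uint64_t w : words)
    for (int i = 0; i < 8; i++)
      out.push_back(static_cast<unsigned char>(w >> (8 * i)));
  return out;
}

// Hypothetical helper (not MariaDB code): rebuild the words from the
// byte stream produced by export_words().
std::vector<uint64_t> import_words(const std::vector<unsigned char>& bytes) {
  assert(bytes.size() % 8 == 0);
  std::vector<uint64_t> words(bytes.size() / 8, 0);
  for (size_t i = 0; i < bytes.size(); i++)
    words[i / 8] |= static_cast<uint64_t>(bytes[i]) << (8 * (i % 8));
  return words;
}
```

Because the shifts operate on values rather than on memory, the same bytes are produced on little-endian and big-endian hosts.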
    • Monty's avatar
      MDEV-33620 Improve times and states in show processlist for replication · 9a132d42
      Monty authored
      This makes it easier to find out what replication workers are
      doing and what they are waiting for.
      
      Things changed in processlist:
      - Slave_SQL time was not consistent. Now time for state "Slave has
        read all relay log; waiting for more updates" shows how long it has
        waited for getting the next event.
      - Slave_worker threads often showed "Closing tables" for a long
        time.  Now the state is reverted to the previous state after
        "Closing tables" is done.
      - Commit and Rollback states were not shown for replication (and some
        other threads). Now Commit and Rollback states are always shown and
        the state is reverted to the previous state when the Commit/Rollback
        has finished.
      
      Code changes:
      - Added thd->set_time_for_next_stage() for parallel replication when
        starting to wait for prior transactions to commit, group commit,
        FTWRL, and free space in the thread pool.
        Previously the time was reset only after the above events.
      - Moved THD_STAGE_INFO(stage_rollback) and THD_STAGE_INFO(stage_commit)
        from sql_parse.cc to transaction.cc to ensure this is done for
        all commits and not only 'normal connection queries'.
      
      Test case changes:
      - close_thread_tables() reverting stage to previous stage caused the
        counter in performance_schema to be increased. In many cases it is
        the 'sql/starting' stage that was affected.
      - We only change to "Commit" stage if there is a need for a commit.
        This caused some "Commit" stages to disappear from perfschema reports.
      
      TODO in 11.#:
      - Slave_IO always shows "Waiting for master to send event" and the time is
        from SLAVE START. We should in 11.# change this to be the time since
        reading the last event.
      9a132d42
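The "revert to the previous state" behaviour described above can be sketched with a scope guard. This is only an illustration: `Session`, `StageGuard` and `close_tables()` are hypothetical names and do not correspond to the real THD class or to close_thread_tables().

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for a thread descriptor; not the real THD class.
struct Session {
  std::string stage = "starting";
};

// Enters a temporary stage and restores the previous one on scope exit.
class StageGuard {
 public:
  StageGuard(Session& s, std::string temporary)
      : session_(s), previous_(s.stage) {
    session_.stage = std::move(temporary);
  }
  ~StageGuard() { session_.stage = previous_; }
  StageGuard(const StageGuard&) = delete;
  StageGuard& operator=(const StageGuard&) = delete;

 private:
  Session& session_;
  std::string previous_;
};

// Returns the stage that was visible while tables were being closed.
std::string close_tables(Session& s) {
  StageGuard guard(s, "Closing tables");
  return s.stage;  // copied before the guard's destructor runs
}
```

With such a guard, a worker that was waiting before the call shows the waiting state again afterwards instead of a stale "Closing tables".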
  4. 07 Mar, 2024 1 commit
    • mariadb-DebarunBanerjee's avatar
      MDEV-33593 Auto increment deadlock error causes ASSERT in subsequent save point · afe96329
      mariadb-DebarunBanerjee authored
      The issue here is that ha_innobase::get_auto_increment() could cause a
      deadlock involving the auto-increment lock and roll back the transaction
      implicitly. For such cases, storage engines usually call
      thd_mark_transaction_to_rollback() to inform the SQL engine about it, which
      in turn takes appropriate actions and closes the transaction. In InnoDB,
      we call it while converting the InnoDB error code to MySQL.
      
      However, since ::innobase_get_autoinc() returns void, we skip the call
      for error code conversion and also miss marking the transaction for
      rollback for deadlock error. We assert eventually while releasing a
      savepoint as the transaction state is not active.
      
      Since convert_error_code_to_mysql() performs some generic error
      handling, like invoking the callback when needed, we should call
      that function in ha_innobase::get_auto_increment() even if we don't
      return the resulting MySQL error code back.
      afe96329
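A minimal sketch of the idea, under simplifying assumptions: every type and function below (`EngineErr`, `Txn`, `convert_error()`, and this `get_auto_increment()`) is hypothetical and far simpler than the InnoDB code. It only shows a void function calling the conversion routine for its side effect and discarding the returned code.

```cpp
#include <cassert>

// Hypothetical, simplified types; none of these are InnoDB's real definitions.
enum class EngineErr { Success, Deadlock, LockWaitTimeout };

struct Txn {
  bool marked_for_rollback = false;
};

// Maps an engine error to a server error code and, as a side effect,
// records that the engine already rolled the transaction back.
int convert_error(EngineErr err, Txn& txn) {
  switch (err) {
    case EngineErr::Success:
      return 0;
    case EngineErr::Deadlock:
      txn.marked_for_rollback = true;
      return 1213;  // ER_LOCK_DEADLOCK
    case EngineErr::LockWaitTimeout:
      return 1205;  // ER_LOCK_WAIT_TIMEOUT
  }
  return -1;
}

// Returns void, so the converted code cannot be propagated; the call is
// still made so that the rollback marking is not skipped.
void get_auto_increment(EngineErr simulated, Txn& txn,
                        unsigned long long* first_value) {
  if (simulated != EngineErr::Success) {
    (void) convert_error(simulated, txn);
    *first_value = ~0ULL;  // signal failure to the caller
    return;
  }
  *first_value = 1;
}
```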
  5. 06 Mar, 2024 2 commits
    • Monty's avatar
      Fixed some mtr results found in Jenkins after MDEV-33582 push · 0df4651c
      Monty authored
      MDEV-33582 Add more warnings to be able to better diagnose network issues
      
      - Disabled "Semisync ack receiver got hangup" warning
        - One could get this warning from semisync if running
          mtr --mysqld=log-warnings=3 rpl.rpl_semi_sync_shutdown_await_ack
      - Fixed result file for engines/funcs/rpl_get_lock.test
      0df4651c
    • Thirunarayanan Balathandayuthapani's avatar
      MDEV-32445 InnoDB may corrupt its log before upgrading it on startup · 6e5333fc
      Thirunarayanan Balathandayuthapani authored
      Problem:
      ========
      During upgrade, InnoDB writes redo log for adjusting
      the tablespace size or tablespace flags even before the log
      has been upgraded to the configured format. This could lead to
      inconsistent data if a crash happened during the upgrade process.
      
      Fix:
      ===
      srv_start(): Write the redo log for the tablespace flags adjustment
      and the increased tablespace size only after the redo log upgrade.
      
      log_write_low(), log_reserve_and_write_fast(): Check whether
      the redo log is in physical format.
      6e5333fc
  6. 05 Mar, 2024 1 commit
    • Monty's avatar
      MDEV-33582 Add more warnings to be able to better diagnose network issues · 567c0973
      Monty authored
      Warnings are added to net_serv.cc when
      global_system_variables.log_warnings >= 4.
      
      When the above condition holds then:
      - All communication errors from net_serv.cc are also written to the
        error log.
      - In case of not being able to read or write a packet, a more
        detailed error is given.
      
      Other things:
      - Added detection of slaves that have hung up to Ack_receiver::run()
      - vio_close() now marks the socket as closed before closing it.
        The reason for this is to ensure that the connection that gets a read
        error can check if the reason was that the socket was closed.
      - Add a new state to vio to be able to detect if vio is active, shutdown or
        closed. This is used to detect if the socket is closed by another thread.
      - Testing of the new warnings is done in rpl_get_lock.test
      - Suppress some of the new warnings in mtr to allow one to run some of
        the tests with --mysqld=--log-warnings=4. All tests in the 'rpl' suite
        can now be run with this option.
      - Ensure that global.log_warnings is restored at test end in a way
        that allows one to use mtr --mysqld=--log-warnings=4.
      
      Reviewed-by: <serg@mariadb.org>,<brandon.nesterenko@mariadb.com>
      567c0973
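The ordering described for vio_close() (publish the "closed" state first, then release the socket) can be sketched as follows. `ConnState`, `Conn`, `close_conn()` and `read_failed_because_closed()` are hypothetical names, not the vio API, and the actual descriptor close is left out.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical connection states, mirroring "active, shutdown or closed".
enum class ConnState { Active, Shutdown, Closed };

// Hypothetical connection object; not the real Vio structure.
struct Conn {
  std::atomic<ConnState> state{ConnState::Active};
  int fd = -1;
};

// Publish the new state before releasing the descriptor, so that a
// reader which gets an error can tell that the close was intentional.
void close_conn(Conn& c) {
  c.state.store(ConnState::Closed, std::memory_order_release);
  c.fd = -1;  // a real implementation would close the descriptor here
}

// Called by a reader after a failed read to classify the failure.
bool read_failed_because_closed(const Conn& c) {
  return c.state.load(std::memory_order_acquire) == ConnState::Closed;
}
```

If the state were set after the close, a reader could observe the read error while the state still said "active" and report a spurious network problem.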
  7. 04 Mar, 2024 2 commits
  8. 03 Mar, 2024 1 commit
  9. 01 Mar, 2024 3 commits
    • Monty's avatar
      Fixed random failure in main.kill_processlist-6619 · 8b3f470c
      Monty authored
      The problem was that SHOW PROCESSLIST was done before the command of
      the default connection was cleared.
      
      Reviewer: Sergei Golubchik <serg@mariadb.org>
      8b3f470c
    • Monty's avatar
      Fixed memory leaks in embedded server and mysqltest · 33dcf815
      Monty authored
      This commit fixes the following issues:
      - Memory leak checking is enabled for mysqltest. This covers all cases except
        calls to 'die()' that only happen in case of internal failures in
        mysqltest. die() is no longer called if the result files differ.
      - One can now run mtr --embedded without failures (this crashed or hung
        before)
      - cleanup_and_exit() has a new parameter that indicates that it is called
        from die(), in which case we should not do memory leak checks. We now
        always call cleanup_and_exit() instead of exit() to be able to free up
        memory and discover memory leaks.
      - Lots of new asserts to catch error conditions
      - More DBUG statements.
      - Fixed that all results are freed in mysqltest (Fixed a memory leak in
        mysqltest when using prepared statements).
      - Fixed race condition in do_stmt_close() that caused embedded server
        to not free memory. (Memory leak in mysqltest with embedded server).
      - Fixed two memory leaks in embedded server when using prepared statements.
        These memory leaks caused timeout hangs in mtr when server was compiled
        with safemalloc. This issue was not noticed (except as timeouts) as
        memory report checking was done but output of it was disabled.
      33dcf815
    • Tony Chen's avatar
      MDEV-26923 Check all invalid config options · 32546877
      Tony Chen authored
      Previously, the behavior was to error out on the first invalid option
      encountered. With this change, a best effort approach is made so that
      all invalid options processed will be printed before exiting.
      
      There is a caveat. The options are processed many times at varying
      stages of server startup because the server is not aware of all valid
      options immediately (e.g. plugins have to be loaded first before the
      server knows what are the available plugin options). So, there are some
      options that the server can determine are invalid "early" on, and there
      are some options that the server cannot determine are invalid until
      "later" on. For example, the server can determine an option such as
      `--a` is an ambiguous option very early on but an option such as
      `--this-does-not-match-any-option` cannot be labelled as invalid until
      the server is aware of all available options.
      
      Thus, it is possible that the server will still fail before printing out
      all "invalid" options. You can see this by passing `--a
      --obvious-invalid-option`.
      
      Test cases were added to `mysqld_option_err.test` to validate that
      multiple invalid options will be displayed in the error message.
      
      All new code of the whole pull request, including one or several files
      that are either new files or modified ones, are contributed under the
      BSD-new license. I am contributing on behalf of my employer Amazon Web
      Services.
      32546877
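A minimal sketch of the "report everything, then fail" approach for a single processing pass. `find_invalid_options()` is a hypothetical helper, and the set of known option names is assumed to be complete, which (as the commit message explains) is not true for the early startup stages of the server.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper (not server code): returns every argument whose
// name is not in `known`, in input order, instead of stopping at the
// first invalid one.
std::vector<std::string> find_invalid_options(
    const std::vector<std::string>& args,
    const std::set<std::string>& known) {
  std::vector<std::string> invalid;
  for (const std::string& arg : args) {
    // Compare only the part before '=', if any.
    std::string name = arg.substr(0, arg.find('='));
    if (known.count(name) == 0) invalid.push_back(arg);
  }
  return invalid;
}
```

The caller can print the whole list and exit once, rather than exiting inside the loop.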
  10. 29 Feb, 2024 1 commit
    • Brandon Nesterenko's avatar
      MDEV-33546: Rpl_semi_sync_slave_status is ON When Replication Is Not Configured · bd604add
      Brandon Nesterenko authored
      If a server has a default configuration (e.g. in a my.cnf file) with
      rpl_semi_sync_slave_enabled set, on server start, the corresponding
      rpl_semi_sync_slave_status variable will also be ON initially, even
      if the slave was never configured/started. This is because the
      Repl_semi_sync_slave initialization logic (function init_object())
      sets the running status to the enabled value during
      init_server_components().
      
      This patch fixes this by removing the statement which sets the
      semi-sync slave running status from the initialization logic. An
      additional change needed from this is to semi-sync recovery: this
      status variable was used as a condition to determine binlog
      truncation during server recovery. This patch also switches this
      condition to reference the global rpl_semi_sync_slave_enabled
      variable. Though note, the semi-sync recovery condition is to be
      changed entirely with the MDEV-33424 agenda.
      
      Reviewed By:
      ============
      Andrei Elkin <andrei.elkin@mariadb.com>
      bd604add
  11. 28 Feb, 2024 2 commits
    • Sergei Petrunia's avatar
      MDEV-33502: part#4: Dont make redundant extra(HA_EXTRA_[NO]_KEYREAD) calls · 31463f11
      Sergei Petrunia authored
      In most cases, ha_partition forwards calls to extra() to all
      locked_partitions. It doesn't make sense to forward some calls for
      partitions that were pruned away.
      This patch introduces ha_partition::loop_read_partitions and makes
      these calls use it:
      
      - ha_partition::extra_opt(HA_EXTRA_KEYREAD)
      - ha_partition::extra(HA_EXTRA_KEYREAD)
      - ha_partition::extra(HA_EXTRA_NO_KEYREAD)
      
      Reviewed-by: Monty
      31463f11
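The effect of skipping pruned partitions can be sketched like this. The function below is a hypothetical model and not the actual ha_partition::loop_read_partitions; it assumes two per-partition flags (locked, and still in the read set after pruning) and forwards the call only where both are set.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical model (not the ha_partition implementation): call fn(i)
// only for partitions that are locked and not pruned away. Returns the
// first non-zero result, or 0 if every call succeeded.
int loop_read_partitions(const std::vector<bool>& locked,
                         const std::vector<bool>& read_set,
                         const std::function<int(size_t)>& fn) {
  assert(locked.size() == read_set.size());
  int result = 0;
  for (size_t i = 0; i < locked.size(); i++) {
    if (!locked[i] || !read_set[i]) continue;  // skip pruned partitions
    int rc = fn(i);
    if (rc != 0 && result == 0) result = rc;
  }
  return result;
}
```

For a table with thousands of partitions where a query touches only a few, this reduces the number of forwarded extra() calls from the partition count to the size of the read set.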
    • Marko Mäkelä's avatar
      MDEV-33508 Performance regression due to frequent scan of full buf_pool.flush_list · 0772ac1f
      Marko Mäkelä authored
      buf_flush_page_cleaner(): Remove a loop that had originally been added
      in commit 9d146652 (MDEV-32029) and made
      redundant by commit 5b53342a (MDEV-32588).
      
      Starting with commit d34479dc (MDEV-33053)
      this loop would cause a significant performance regression in workloads
      where buf_pool.need_LRU_eviction() constantly holds in
      buf_flush_page_cleaner().
      
      Thanks to Steve Shaw of Intel for noticing this.
      
      Reviewed by: Debarun Banerjee
      Tested by: Matthias Leich
      0772ac1f
  12. 27 Feb, 2024 8 commits
    • Monty's avatar
    • Monty's avatar
      Fixed crash in connect.misc with embedded server · 89aae15d
      Monty authored
      The problem was that connect tried to recursively call
      emb_advanced_command(), which was not supported.
      
      Fixed by adding support for recursive calls.
      89aae15d
    • Monty's avatar
      Updated test cases result for s3.partition · 0c079f4f
      Monty authored
      MDEV-21472 ALTER TABLE ... ANALYZE PARTITION ... with EITS reads and locks all rows
      0c079f4f
    • Monty's avatar
      Optimize performance of my_bitmap · b5d65fc1
      Monty authored
      MDEV-33502 Slowdown when running nested statement with many partitions
      
      This change was triggered to help some MariaDB users with close to
      10000 bits in their bitmaps.
      
      - Change underlying storage to be 64 bit instead of 32 bit.
        - This reduces the number of loops needed to scan bitmaps.
        - This can cause some bitmaps to be 4 bytes larger.
      - Ensure that all unused top bits are always 0 (simplifies code as
        the last 64 bit storage is not a special case anymore).
      - Use my_find_first_bit() to find the first set bit, which is much faster
        than scanning through things byte by byte and then bit by bit.
      
      Other things:
      - Added a bool to remember if my_bitmap_init() did allocate the bitmap
        array. my_bitmap_free() will only free arrays it did allocate.
        This allowed me to remove setting 'bitmap=0' before calling
        my_bitmap_free() for cases where the bitmaps were allocated externally.
      - my_bitmap_init() sets bitmap to 0 in case of failure.
      - Added 'universal' asserts to most bitmap functions.
      - Change all remaining calls to bitmap_init() to my_bitmap_init().
        - To finish the change from 2014.
      - Changed all usage of uint32 in my_bitmap.h to my_bitmap_map.
      - Updated bitmap_copy() to handle bitmaps of different size.
      - Removed const from bitmap_exists_intersection() as this caused casts
        on all usage.
      - Removed not used function bitmap_set_above().
      - Renamed create_last_word_mask() to create_last_bit_mask() (to match
        name changes in my_bitmap.cc)
      - Extended bitmap-t with test for more bitmap functions.
      b5d65fc1
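A minimal sketch of the two ideas above: 64-bit word storage with the unused top bits kept at 0, and a word-at-a-time search for the first set bit. The `Bitmap` struct and all function names are hypothetical and do not reproduce my_bitmap or my_find_first_bit(); the bit-counting loop is a portable placeholder where real code would likely use a compiler intrinsic.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr size_t kNoBit = static_cast<size_t>(-1);

// Hypothetical bitmap with 64-bit words; not the MY_BITMAP structure.
struct Bitmap {
  size_t n_bits;
  std::vector<uint64_t> words;
  explicit Bitmap(size_t bits) : n_bits(bits), words((bits + 63) / 64, 0) {}
};

// Mask of the bits that are in use in the last word.
uint64_t last_word_mask(const Bitmap& b) {
  size_t used = b.n_bits % 64;
  return used == 0 ? ~0ULL : (1ULL << used) - 1;
}

// Sets every valid bit while keeping the unused top bits at 0.
void set_all(Bitmap& b) {
  if (b.words.empty()) return;
  for (uint64_t& w : b.words) w = ~0ULL;
  b.words.back() &= last_word_mask(b);
}

void clear_bit(Bitmap& b, size_t bit) {
  assert(bit < b.n_bits);
  b.words[bit / 64] &= ~(1ULL << (bit % 64));
}

// Portable count of trailing zero bits; w must be non-zero.
unsigned trailing_zeros(uint64_t w) {
  unsigned n = 0;
  while ((w & 1) == 0) { w >>= 1; n++; }
  return n;
}

// Skips empty words 64 bits at a time. Because unused top bits are
// always 0, the last word needs no special handling here.
size_t first_set(const Bitmap& b) {
  for (size_t i = 0; i < b.words.size(); i++)
    if (b.words[i] != 0) return i * 64 + trailing_zeros(b.words[i]);
  return kNoBit;
}
```

For a bitmap of roughly 10000 bits this scans about 157 words instead of 313 with 32-bit storage.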
    • mariadb-DebarunBanerjee's avatar
      MDEV-33011 mariabackup --backup: FATAL ERROR: ... Can't open datafile cool_down/t3 · 96966976
      mariadb-DebarunBanerjee authored
      The root cause is the WAL logging of file operation when the actual
      operation fails afterwards. It creates a situation with a log entry for
      an operation that would always fail. I could simulate both the backup
      scenario error and Innodb recovery failure exploiting the weakness.
      
      We follow WAL for the file rename operation, and once logged, the
      operation must eventually complete successfully, or it is a major
      catastrophe. Right now, a failed rename is handled as a normal error,
      and that is the problem.
      
      I created a patch to address RENAME operation to a non existing schema
      where the destination schema directory is missing. The patch checks for
      the missing schema before logging in an attempt to avoid the failure
      after WAL log is written/flushed. I also checked that the schema cannot
      be dropped or there cannot be any race with other rename to the same
      file. This is protected by the MDL lock in SQL today.
      
      The patch should thus be a good improvement over the current situation
      and solves the issue at hand.
      96966976
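The "check before logging" order can be sketched as below. `Storage` and `rename_table()` are hypothetical, the log is just a list of strings, and the set of schema names stands in for the directory check; the real patch additionally relies on MDL to keep the schema from disappearing between the check and the rename.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical model of storage state; not InnoDB data structures.
struct Storage {
  std::set<std::string> schemas;  // existing schema directories
  std::vector<std::string> wal;   // write-ahead log records
};

// Validates everything that could make the rename fail BEFORE the log
// record is written, so that a logged rename can always be completed.
bool rename_table(Storage& st, const std::string& from,
                  const std::string& to_schema, const std::string& to_name) {
  if (st.schemas.count(to_schema) == 0)
    return false;  // ordinary error: nothing has been logged yet
  st.wal.push_back("RENAME " + from + " -> " + to_schema + "/" + to_name);
  // The actual file rename would happen here and must not fail.
  return true;
}
```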
    • Monty's avatar
      Optimize handler_stats_disable() when handler_stats are already disabled · d4e1731f
      Monty authored
      MDEV-33502 Slowdown when running nested statement with many partitions
      d4e1731f
    • Monty's avatar
      Have ha_partition ignore HA_EXTRA..CHILDREN extra() calls if no myisamrg · a8f6b86c
      Monty authored
      MDEV-33502 Slowdown when running nested statement with many partitions
      
      Optimization for tables with a lot of partitions
      a8f6b86c
    • Marko Mäkelä's avatar
      MDEV-24671 fixup: Remove srv_max_n_threads · 71834ccb
      Marko Mäkelä authored
      The variable srv_max_n_threads lost its usefulness in
      commit db006a9a (MDEV-21452)
      and commit e71e6133 (MDEV-24671).
      71834ccb
  13. 26 Feb, 2024 2 commits
    • Igor Babaev's avatar
      MDEV-31276 Wrong warnings on 2-nd execution of PS for query with GROUP_CONCAT · 8778a83e
      Igor Babaev authored
      If a query with GROUP_CONCAT is executed then the server reports a warning
      every time when the length of the result of this function exceeds the set
      value of the system variable group_concat_max_len. This bug led to the set
      of warnings from the second execution of the prepared statement that did
      not coincide with the one from the first execution if the executed query
      was a grouping query over a join of tables using GROUP_CONCAT function and
      join cache was not allowed to be employed.
      The discrepancy between the sets of warnings was due to lack of cleanup for
      Item_func_group_concat::row_count after execution of the query.
      
      Approved by Oleksandr Byelkin <sanja@mariadb.com>
      8778a83e
    • Thirunarayanan Balathandayuthapani's avatar
      MDEV-14193 innodb.log_file_name failed in buildbot with exception · 2c5f3bbe
      Thirunarayanan Balathandayuthapani authored
      Problem:
      =======
      - innodb.log_file_name fails if it executes after
      innodb.lock_insert_into_empty in a few cases.
      The innodb.lock_insert_into_empty test case failed to
      clean up the table t2. Rollback of create..select fails
      to remove the table when it fails to acquire the
      InnoDB statistics table. This causes the rename table
      in the log_file_name test case to fail.
      
      Solution:
      ========
      - Clean up the table t2 explicitly after resetting
      innodb_lock_wait_timeout variable in
      innodb.lock_insert_into_empty test case.
      2c5f3bbe
  14. 23 Feb, 2024 3 commits
    • Alexander Barkov's avatar
      MDEV-33496 Out of range error in AVG(YEAR(datetime)) due to a wrong data type · e63311c2
      Alexander Barkov authored
      Functions extracting non-negative datetime components:
      
      - YEAR(dt),        EXTRACT(YEAR FROM dt)
      - QUARTER(dt),     EXTRACT(QUARTER FROM dt)
      - MONTH(dt),       EXTRACT(MONTH FROM dt)
      - WEEK(dt),        EXTRACT(WEEK FROM dt)
      - HOUR(dt),
      - MINUTE(dt),
      - SECOND(dt),
      - MICROSECOND(dt),
      - DAYOFYEAR(dt)
      - EXTRACT(YEAR_MONTH FROM dt)
      
      did not set their max_length properly, so in the DECIMAL
      context they created a too small DECIMAL column, which
      led to the 'Out of range value' error.
      
      The problem is that most of these functions historically
      returned the signed INT data type.
      
      There were two simple ways to fix these functions:
      1. Add +1 to max_length.
         But this would also change their size in the string context
         and create too long VARCHAR columns, with +1 excessive size.
      
      2. Preserve max_length, but change the data type from INT to INT UNSIGNED.
         But this would break backward compatibility.
         Also, using UNSIGNED is generally not desirable,
         it's better to stay with signed when possible.
      
      This fix implements another solution, which makes all these functions
      work well in all contexts: int, decimal, string.
      
      Fix details:
      
      - Adding a new special class Type_handler_long_ge0 - the data type
        handler for expressions which:
        * should look like normal signed INT
        * but which are known not to return negative values
        Expressions handled by Type_handler_long_ge0 store in Item::max_length
        only the number of digits, without adding +1 for the sign.
      
      - Fixing Item_extract to use Type_handler_long_ge0
        for non-negative datetime components:
         YEAR, YEAR_MONTH, QUARTER, MONTH, WEEK
      
      - Adding a new abstract class Item_long_ge0_func, for functions
        returning non-negative datetime components.
        Item_long_ge0_func uses Type_handler_long_ge0 as the type handler.
        The class hierarchy now looks as follows:
      
      Item_long_ge0_func
        Item_long_func_date_field
          Item_func_to_days
          Item_func_dayofmonth
          Item_func_dayofyear
          Item_func_quarter
          Item_func_year
        Item_long_func_time_field
          Item_func_hour
          Item_func_minute
          Item_func_second
          Item_func_microsecond
      
      - Cleanup: EXTRACT(QUARTER FROM dt) created an excessive VARCHAR column
        in string context. Changing its length from 2 to 1.
      e63311c2
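The arithmetic behind the problem can be sketched under one loudly stated assumption: that for a signed integer expression one character of max_length is reserved for the sign when the DECIMAL precision is derived. The names `IntKind`, `decimal_precision()` and `digit_count()` are hypothetical and only model that assumption; they are not the server's type handler code.

```cpp
#include <cassert>

// Hypothetical model: Signed reserves one character of max_length for
// '-', NonNegative (the Type_handler_long_ge0 idea) does not.
enum class IntKind { Signed, NonNegative };

// Digits available in a DECIMAL column derived from an integer
// expression with the given max_length (assumed derivation rule).
unsigned decimal_precision(unsigned max_length, IntKind kind) {
  if (kind == IntKind::Signed)
    return max_length > 0 ? max_length - 1 : 0;
  return max_length;
}

// Number of decimal digits needed to print v.
unsigned digit_count(unsigned long long v) {
  unsigned n = 1;
  while (v >= 10) { v /= 10; n++; }
  return n;
}
```

With max_length 4 for YEAR(dt), the signed interpretation leaves 3 digits, which cannot hold 2024, while the non-negative interpretation leaves 4 digits and keeps the string length at 4.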
    • Oleg Smirnov's avatar
      5b8493ba
    • Thirunarayanan Balathandayuthapani's avatar
      MDEV-30655 IMPORT TABLESPACE fails with column count or index count mismatch · e309e024
      Thirunarayanan Balathandayuthapani authored
      update_vcol_pos(): pass table id as table_id_t instead of ulint.
      e309e024
  15. 22 Feb, 2024 1 commit
    • Thirunarayanan Balathandayuthapani's avatar
      MDEV-33462 Server aborts while altering an InnoDB statistics table · e66928ab
      Thirunarayanan Balathandayuthapani authored
      Problem:
      =======
      - When online alter of InnoDB statistics table happens,
      any transaction which updates the statistics table
      has to read the undo log and log the DML changes during
      transaction commit. Applying undo log
      (UndorecApplier::apply_undo_rec) requires a shared
      lock on dictionary cache but dict_stats_save() already
      holds write lock on dictionary cache. This leads to
      abort of server during commit of statistics table changes.
      
      Solution:
      ========
      - Disallow LOCK=NONE operation for the InnoDB statistics table.
      The reasoning is that statistics tables are typically
      rather small, so any blocking would be rather short.
      Writes to the statistics tables should be a rare operation.
      e66928ab
  16. 21 Feb, 2024 3 commits
    • Igor Babaev's avatar
      MDEV-31277 Wrong result on 2-nd execution of PS to select from view using derived · d57c44f6
      Igor Babaev authored
      As a result of this bug the second execution of the prepared statement
      created for select from materialized view could return a wrong result set if
      - the specification of the view used a left join
      - an inner table of the left join was a mergeable derived table
      - the derived table contained a constant column.
      
      The problem appeared because the flag 'maybe-null' of the wrapper
      Item_direct_view_ref constructed for the constant field of the mergeable
      derived table was not set to 'true' on the second execution of the
      prepared statement.
      
      The patch always sets this flag properly when calling the function
      Item_direct_view_ref::set_null_ref_table(). The latter is invoked in
      Item_direct_view_ref constructor if it is created for some reference of
      a constant column belonging to a mergeable derived table.
      
      Approved by Oleksandr Byelkin <sanja@mariadb.com>
      d57c44f6
    • Marko Mäkelä's avatar
      MDEV-24167 fixup: srw_lock_debug for SUX_LOCK_GENERIC · 042c3fc4
      Marko Mäkelä authored
      srw_lock_debug::have_rd(), srw_lock_debug::have_wr():
      For SUX_LOCK_GENERIC (no futex based synchronization primitives),
      we cannot check if the underlying srw_lock is held by us.
      
      Thanks to Dmitry Shulga for pointing out this build failure.
      042c3fc4
    • Yuchen Pei's avatar
      0f0da95d
  17. 20 Feb, 2024 5 commits
    • Brandon Nesterenko's avatar
      MDEV-33500: rpl.rpl_parallel_sbm can fail on slow machines, e.g. MSAN/Valgrind builders · b04c8575
      Brandon Nesterenko authored
      In an addition to test rpl.rpl_parallel_sbm added by MDEV-32265, the
      test uses sleep statements alone to test Seconds_Behind_Master with
      delayed replication. On slow running machines, the test can pass the
      intended MASTER_DELAY duration and Seconds_Behind_Master can become
      0, when the test expects the transaction to still be actively in a
      delaying state.
      
      This can be consistently reproduced by adding a sleep statement
      before the call to
      
      --let = query_get_value(SHOW SLAVE STATUS, Seconds_Behind_Master, 1)
      
      to sleep past the delay end point.
      
      This patch fixes this by locking the table which the delayed
      transaction targets so Seconds_Behind_Master cannot be updated before
      the test reads it for validation.
      b04c8575
    • Thirunarayanan Balathandayuthapani's avatar
      MDEV-30655 IMPORT TABLESPACE fails with column count or index count mismatch · 903ae300
      Thirunarayanan Balathandayuthapani authored
      Problem:
      ========
      Currently import operation fails with schema mismatch when
      cfg file has hidden fts document id and hidden fts document index.
      
      Fix:
      ====
      To fix this issue, simply add the fts doc id column,
      indexes in table definition and try to import the table.
      In case of success:
      1) update the fts document id in sys columns.
      2) update the number of columns in sys tables.
      3) insert the new fts index entry in sys indexes table
      and sys fields.
      4) Reload the table with new table definition
      903ae300
    • Monty's avatar
    • Monty's avatar
      Get rid of error when running mariadb-install-db with --log-bin · 90bbeafb
      Monty authored
      This removes the error:
      "Failed to load slave replication state from table mysql.gtid_slave_pos:
      1017: Can't find file: './mysql/' (errno: 2 "No such file or directory")
      90bbeafb
    • Andrew Daugherity's avatar
      fix markdown headings · 75c0f951
      Andrew Daugherity authored
      Should be h2 rather than h1, and GitHub requires an intervening space.
      75c0f951
  18. 19 Feb, 2024 1 commit