1. 27 Jun, 2024 2 commits
    • MDEV-33894: Resurrect innodb_log_write_ahead_size · 4ca355d8
      Marko Mäkelä authored
      As part of commit 685d958e (MDEV-14425)
      the parameter innodb_log_write_ahead_size was removed, because it was
      thought that determining the physical block size would be a sufficient
      replacement.
      
      However, we can only determine the physical block size on Linux or
      Microsoft Windows. On some file systems, the physical block size
      is not relevant. For example, XFS uses a block size of 4096 bytes
      even if the underlying block size may be smaller.
      
      On Linux, we failed to determine the physical block size if
      innodb_log_file_buffered=OFF was not requested or possible.
      This will be fixed.
      
      log_sys.write_size: The value of the reintroduced parameter
      innodb_log_write_ahead_size. To keep it simple, this is read-only
      and a power of two between 512 and 4096 bytes, so that the previous
      alignment guarantees are fulfilled. This will replace the previous
      log_sys.get_block_size().
      
      log_sys.block_size, log_t::get_block_size(): Remove.
      
      log_t::set_block_size(): Ensure that write_size will not be less
      than the physical block size. There is no point in invoking this
      function with 512 or less, because that is the minimum value of
      write_size.
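
The constraints on log_sys.write_size described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the InnoDB code: the names write_size_ok(), raise_to_block_size() and align_up() are hypothetical, and physical block sizes above 4096 bytes are not handled, since the commit message notes that allowing write_size>4096 would need further adjustment.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration; not taken from the InnoDB source.

// The bounds that the commit message gives for innodb_log_write_ahead_size.
constexpr uint32_t WRITE_SIZE_MIN= 512;
constexpr uint32_t WRITE_SIZE_MAX= 4096;

// A valid write_size is a power of two between 512 and 4096 bytes.
constexpr bool write_size_ok(uint32_t n)
{
  return n >= WRITE_SIZE_MIN && n <= WRITE_SIZE_MAX && (n & (n - 1)) == 0;
}

// Model of "write_size will not be less than the physical block size":
// the result is the larger of the two values.
constexpr uint32_t raise_to_block_size(uint32_t write_size,
                                       uint32_t block_size)
{
  return write_size < block_size ? block_size : write_size;
}

// Round a byte count up to the next multiple of write_size.
// This works with a bit mask because write_size is a power of two.
inline uint64_t align_up(uint64_t length, uint32_t write_size)
{
  assert(write_size_ok(write_size));
  return (length + write_size - 1) & ~uint64_t{write_size - 1};
}
```

Because 512 is already the minimum of write_size, raise_to_block_size(write_size, 512) never changes anything, which matches the remark that invoking set_block_size() with 512 or less is pointless.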
      
      innodb_params_adjust(): Add some disabled code for adjusting
      the minimum value and default value of innodb_log_write_ahead_size
      to reflect the log_sys.write_size.
      
      log_t::set_recovered(): Mark the recovery completed. This is the
      place to adjust some things if we want to allow write_size>4096.
      
      log_t::resize_write_buf(): Refer to write_size.
      
      log_t::resize_start(): Refer to write_size instead of get_block_size().
      
      log_write_buf(): Simplify some arithmetic and remove a goto.
      
      log_t::write_buf(): Refer to write_size. If we are writing less than
      that, do not switch buffers, but keep writing to the same buffer.
      Move some code to improve the locality of reference.
      
      recv_scan_log(): Refer to write_size instead of get_block_size().
      
      os_file_create_func(): For type==OS_LOG_FILE on Linux, always invoke
      os_file_log_maybe_unbuffered(), so that log_sys.set_block_size() will
      be invoked even if we are not attempting to use O_DIRECT.
      
      recv_sys_t::find_checkpoint(): Read the entire log header
      in a single 12 KiB request into log_sys.buf.
      
      Tested with:
      ./mtr --loose-innodb-log-write-ahead-size=4096
      ./mtr --loose-innodb-log-write-ahead-size=2048
    • Merge 10.6 into 10.11 · 27a33666
      Marko Mäkelä authored
  2. 26 Jun, 2024 3 commits
  3. 25 Jun, 2024 5 commits
    • Yuchen Pei
      ad0ee8cd
    • MDEV-34361 Split my.cnf in the spider suite. · 01289dac
      Yuchen Pei authored
      Just like the spider/bugfix suite.
      
      One caveat is that my_2_3.cnf needs something under mysqld.2.3 group,
      otherwise mtr will fail with something like:
      
      There is no group named 'mysqld.2.3' that can be used to resolve
      'port' for ...
      
      This will allow new tests under the spider suite to use what is
      needed. It also somehow fixes issues of running a test followed by
      spider.slave_trx_isolation.
    • Yuchen Pei
    • MDEV-34171: Memory leakage is detected on running the test versioning.partition · 77c465d5
      Dmitry Shulga authored
      One possible use case that reproduces the memory leak is listed below:
      
        set timestamp= unix_timestamp('2000-01-01 00:00:00');
        create or replace table t1 (x int) with system versioning
          partition by system_time interval 1 hour auto
          partitions 3;
      
        create table t2 (x int);
      
        create trigger tr after insert on t2 for each row update t1 set x= 11;
        create or replace procedure sp2() insert into t2 values (5);
      
        set timestamp= unix_timestamp('2000-01-01 04:00:00');
        call sp2;
      
        set timestamp= unix_timestamp('2000-01-01 13:00:00');
        call sp2; # <<=== The memory leak happens here. If MariaDB server is built
                  #       with the option -DWITH_PROTECT_STATEMENT_MEMROOT,
                  #       the second execution would hit an assertion failure.
      
      The reason for the memory leak is that once a new partition is created,
      the table must be closed and re-opened. This results in a call to
      extend_table_list(), which indirectly invokes sp_add_used_routine()
      to add the routines implicitly used by the statement, and that makes a
      new memory allocation.
      
      To fix it, don't remove the routines and tables that the statement
      implicitly depends on when a table is being closed for subsequent re-opening.
    • MDEV-24411: Trigger doesn't work correctly with bulk insert · 8b169949
      Dmitry Shulga authored
      Executing an INSERT statement in PS mode with a positional parameter
      bound to an array could result in an incorrect number of inserted rows
      if there is a BEFORE INSERT trigger that executes yet another
      INSERT statement to put a copy of the row being inserted into some table.
      
      The reason for the incorrect number of inserted rows is that the data
      structure used for binding a positional argument to its actual values is
      stored in THD (this is thd->bulk_param) and reused on processing every
      INSERT statement. As a result, the actual values bound to the top-level
      INSERT statement are consumed by the other INSERT statements in the
      trigger bodies.
      
      To fix the issue, temporarily reset thd->bulk_param to nullptr
      before invoking triggers and restore its value when their execution finishes.
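
The save, reset and restore pattern described in the fix can be sketched with a small scope guard. This is only an illustration: thd_sketch, bulk_param_sketch, bulk_param_suspend and invoke_trigger_sketch() are hypothetical stand-ins, and the commit message does not say whether the actual patch uses a guard object or explicit assignments.

```cpp
#include <cassert>

// Hypothetical stand-ins; the real THD and bulk parameter types are
// far larger and are not reproduced here.
struct bulk_param_sketch { unsigned rows; };
struct thd_sketch { bulk_param_sketch *bulk_param= nullptr; };

// Hides thd.bulk_param for the lifetime of the guard and restores the
// saved pointer on scope exit, including exits via early return.
class bulk_param_suspend
{
  thd_sketch &thd;
  bulk_param_sketch *const saved;
public:
  explicit bulk_param_suspend(thd_sketch &t) : thd(t), saved(t.bulk_param)
  { thd.bulk_param= nullptr; }
  ~bulk_param_suspend() { thd.bulk_param= saved; }
  bulk_param_suspend(const bulk_param_suspend&)= delete;
  bulk_param_suspend &operator=(const bulk_param_suspend&)= delete;
};

// Runs a trigger body so that statements inside it cannot consume the
// values bound to the top-level statement.
template<typename Body>
void invoke_trigger_sketch(thd_sketch &thd, Body &&body)
{
  bulk_param_suspend guard(thd);
  assert(thd.bulk_param == nullptr);
  body();
}
```

With this shape, the top-level INSERT sees its own bindings again as soon as the trigger returns, because the destructor restores the saved pointer.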
  4. 24 Jun, 2024 4 commits
  5. 22 Jun, 2024 3 commits
  6. 21 Jun, 2024 2 commits
  7. 20 Jun, 2024 6 commits
    • MDEV-33746 Supply missing override markings · db0c28ef
      Dave Gosselin authored
      Find and fix missing virtual override markings.  Updates cmake
      maintainer flags to include -Wsuggest-override and
      -Winconsistent-missing-override.
    • MDEV-34108 Inappropriate semi-consistent read in RC if innodb_snapshot_isolation=ON · 0a199cb8
      Vlad Lesin authored
      The fixes in b8a67198 did not disable semi-consistent read for
      innodb_snapshot_isolation=ON mode; they merely allowed reading the
      uncommitted version of a record, which is why the test for MDEV-26643
      worked.
      
      Semi-consistent read should be disabled at the upper level, in
      row_search_mvcc(), for the READ COMMITTED isolation level.
      
      Reviewed by Marko Mäkelä.
    • MDEV-34389 Avoid log overwrite in early recovery · ab448d4b
      Thirunarayanan Balathandayuthapani authored
      - InnoDB tries to write a FILE_CHECKPOINT marker during
      early recovery when the log file size is insufficient.
      While updating the log checkpoint at the end of the recovery,
      InnoDB must already have written out all pending changes
      to the persistent files. To complete the checkpoint, InnoDB
      has to write some log records for the checkpoint and
      update the checkpoint header. If the server gets killed
      before updating the checkpoint header, the log file
      becomes unrecoverable.
      
      - This patch avoids writing the FILE_CHECKPOINT marker during early
      recovery and narrows the window of opportunity for
      making the log file unrecoverable.
    • MDEV-34417 Wrong result set with utf8mb4_danish_ci and BNLH join · 6cecf61a
      Alexander Barkov authored
      There were erroneous calls to charpos() in key_hashnr() and key_buf_cmp().
      These functions are never called with prefix segments.
      
      The charpos() calls were wrong. Before the change, BNLH joins
      - could return wrong result sets, as reported in MDEV-34417
      - were extremely slow for multi-byte character sets, because
        the hash was calculated on string prefixes, which increased
        the number of collisions drastically.
      
      This patch fixes the wrong result set as reported in MDEV-34417,
      as well as (partially) the performance problem reported in MDEV-34352.
    • Disable new connections in case of fatal signal · 279aa1e6
      Monty authored
      A user reported that MariaDB server got a signal 6 but still accepted new
      connections and did not crash.  I have not been able to find a way to
      repeat this or find the cause of the issue. However, to make it easier to
      notice that the server is unstable, I added code to disable new
      connections once the handle_fatal_signal() handler has been called.
    • MDEV-33582 Add more warnings to be able to better diagnose network issues · 3541bd63
      Monty authored
      Changed the logged messages from errors to warnings.
      Also changed 'remain' to 'read_length' in the warning to make it more readable.
  8. 19 Jun, 2024 13 commits
    • MDEV-34428 bootstrap can't delete tempfile, it is already gone · 6c2cd4cf
      Vladislav Vaintroub authored
      The problem is seen on CI, where TEMP pointed to a directory outside of
      the usual vardir, when testing mysql_install_db.exe.
      A likely cause of this error is that TEMP was periodically cleaned up
      by some automation running on the host, perhaps by buildbot itself.
      
      To fix, mysql_install_db.exe will now use datadir as --tmpdir
      for the bootstrap run. This will minimize the chance of running into any
      environment problems.
    • MDEV-34311: Alter USER should reset all account limit counters · 63823391
      Vicențiu Ciorbaru authored
      This commit resets the password error counter of the altered user on
      any ALTER USER command. This is done so as to not require a
      complete privilege system reload.
    • cleanup, refactor · 2d8d8139
      Vicențiu Ciorbaru authored
      Fix coding style and extract common password reset counter code into
      separate ACL_USER method.
    • MDEV-33935 fix deadlock counter · 5d49a2ad
      Iaroslav Babanin authored
      - The deadlock counter was moved from
      Deadlock::find_cycle into Deadlock::report, because
      the find_cycle method is called multiple times during the deadlock
      detection flow, which means it should not have such side effects.
      But report() can, because it is called only once for
      a victim transaction.
      - Also, the deadlock_detect.test and *.result test case
      has been extended to cover the fix.
    • MDEV-31658 : Deadlock found when trying to get lock during applying · ee974ca5
      Jan Lindström authored
      The problem was that there were two non-conflicting local idle
      transactions in node_1 that both inserted a key into the primary key.
      Then two transactions from other nodes also inserted
      a key into the primary key, so that the insert from node_2 conflicted with
      one of the local transactions in node_1 and there would
      be a duplicate key if both were committed. For this, the insert
      from the other node tries to acquire an S-lock on this record,
      and because this insert is a high priority brute force (BF)
      transaction, it will kill the idle local transaction.
      
      Concurrently, the second insert from node_3 conflicts with the second
      idle insert transaction in node_1. Again, it tries to acquire an
      S-lock on this record and kills the idle local transaction.
      
      At this point we have two non-conflicting high priority
      transactions holding S-lock on different records in node_1.
      For example like this: rec s-lock-node2-rec s-lock-node3-rec rec.
      
      Because these high priority BF-transactions do not wait for
      each other, the insert from node3, which has a later seqno than
      the insert from node2, can continue. It will try to acquire an
      insert intention lock for the record it tries to insert (to avoid
      a duplicate key being inserted by a local transaction). However,
      it will note that there is a conflicting S-lock in the same gap
      between records. This will lead to a deadlock error, as we have
      defined that BF-transactions may not wait for a record lock,
      but we can't kill the conflicting BF-transaction because
      it has a lower seqno and it should commit first.
      
      BF-transactions are executed concurrently because their
      values to primary key are different i.e. they do not
      conflict.
      
      Galera certification will make sure that inserts from
      other nodes i.e these high priority BF-transactions
      can't insert duplicate keys. Local transactions naturally
      can but they will be killed when BF-transaction
      acquires required record locks.
      
      Therefore, we can allow the situation where there is a conflicting
      S-lock and insert intention lock regardless of their seqno
      order and let both continue with no wait. This will lead
      to a situation where we need to allow a BF-transaction
      to wait when lock_rec_has_to_wait_in_queue is called,
      because this function is also called from
      lock_rec_queue_validate, and because the lock is waiting
      there would be an assertion failure in ut_a(lock->is_gap()
      || lock_rec_has_to_wait_in_queue(cell, lock));
      
      lock_wait_wsrep_kill
        Add debug sync points for BF-transactions killing
        local transaction.
      
      wsrep_assert_no_bf_bf_wait
        Print also requested lock information
      
      lock_rec_has_to_wait
        Add function to handle wsrep transaction lock wait
        cases.
      
      lock_rec_has_to_wait_wsrep
        New function to handle wsrep transaction lock wait
        exceptions.
      
      lock_rec_has_to_wait_in_queue
        Remove wsrep exception, in this function all
        conflicting locks need to wait in queue.
        Conflicts between BF and local transactions
        are handled in lock_wait.
      Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
    • Julius Goryavsky
    • MDEV-12008 : Change error code for Galera unkillable threads · 1001dae1
      Jan Lindström authored
      Changed the error code for Galera unkillable threads to
      ER_KILL_DENIED_HIGH_PRIORITY, which gives the message
      
      This is a high priority thread/query and cannot be killed
      without compromising the consistency of the cluster
      
      A warning is also produced:
        Thread %lld is [wsrep applier|high priority] and cannot be killed
      Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
    • Merge 10.6 into 10.11 · 34813c1a
      Marko Mäkelä authored
    • MDEV-29934 rpl.rpl_start_alter_chain_basic, rpl.rpl_start_alter_restart_slave... · 387bdb2a
      Andrei authored
      MDEV-29934 rpl.rpl_start_alter_chain_basic, rpl.rpl_start_alter_restart_slave sometimes fail in BB with result content mismatch
      
      rpl.rpl_start_alter_chain_basic used to fail sporadically due
      to a missing GTID master-slave synchronization, which was necessary
      because of the following SELECT from the GTID-state table.
      
      Fixed by arranging two synchronization pieces for the two
      chain slaves requiring that.
      
      Note that rpl.rpl_start_alter_restart_slave must have been fixed by
      MDEV-30460 and the 87e13722 (manual) merge commit.
    • MDEV-34178: Enable spinloop for index_lock · 5b26a076
      Marko Mäkelä authored
      In an I/O bound concurrent INSERT test conducted by Mark Callaghan,
      spin loops on dict_index_t::lock turn out to be beneficial.
      
      This is a mixed bag; enabling the spin loops will improve throughput
      and latency on some workloads and degrade them on others.
      
      Reviewed by: Debarun Banerjee
      Tested by: Matthias Leich
      Performance tested by: Axel Schwenke
    • MDEV-34178: Improve the spin loops · f8d213bd
      Marko Mäkelä authored
      srw_mutex_impl<spinloop>::wait_and_lock(): Invoke srw_pause() and
      reload the lock word on each loop. Thanks to Mark Callaghan for
      suggesting this.
      
      ssux_lock_impl<spinloop>::rd_wait(): Actually implement a spin loop
      on the rw-lock component without blocking on the mutex component.
      If there is a conflict with wr_lock(), wait for writer.lock to be
      released without actually acquiring it.
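
The idea of pausing and reloading the lock word on each spin round can be sketched as follows. This is a simplified model, not the srw_mutex_impl code: spin_mutex_sketch and its members are hypothetical, std::this_thread::yield() merely stands in for srw_pause(), and a real implementation would block in the kernel once the spin budget is exhausted instead of returning false.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>

// Hypothetical model of a spinning mutex; 0 means free, 1 means held.
class spin_mutex_sketch
{
  std::atomic<uint32_t> lock_word{0};
public:
  // One acquisition attempt without any waiting.
  bool try_lock()
  {
    uint32_t expected= 0;
    return lock_word.compare_exchange_strong(expected, 1,
                                             std::memory_order_acquire,
                                             std::memory_order_relaxed);
  }

  // Spin for at most `rounds` iterations. A return value of false means
  // that the caller should fall back to a blocking wait.
  bool spin_lock(unsigned rounds)
  {
    for (unsigned i= 0; i < rounds; i++)
    {
      std::this_thread::yield();  // stand-in for srw_pause()
      // Reload the lock word in every round, and attempt the more
      // expensive compare-and-swap only when the lock looks free.
      if (lock_word.load(std::memory_order_relaxed) == 0 && try_lock())
        return true;
    }
    return false;
  }

  void unlock() { lock_word.store(0, std::memory_order_release); }
};
```

Reading the word with a plain load before the compare-and-swap keeps waiting threads from contending on the lock word with atomic read-modify-write operations while the lock is still held.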
      
      Reviewed by: Debarun Banerjee
      Tested by: Matthias Leich
    • MDEV-34178: Improve PERFORMANCE_SCHEMA instrumentation · 6cde03ae
      Marko Mäkelä authored
      When MariaDB is built with PERFORMANCE_SCHEMA support enabled
      and with futex-based rw-locks (not srw_lock_), we were unnecessarily
      releasing and reacquiring lock.writer in srw_lock_impl::psi_wr_lock()
      and ssux_lock::psi_wr_lock().
      
      If there is a conflict with rd_lock(), let us hold the lock.writer
      and execute u_wr_upgrade() to wait for rd_unlock().
      
      Reviewed by: Debarun Banerjee
      Tested by: Matthias Leich
    • MDEV-27966 Assertion `fixed()' failed and Assertion `fixed == 1' failed, both... · cfa61434
      Alexander Barkov authored
      MDEV-27966 Assertion `fixed()' failed and Assertion `fixed == 1' failed, both in Item_func_concat::val_str on SELECT after INSERT with collation utf32_bin on utf8_bin table
      
      This problem was earlier fixed by this commit:
      
      > commit 08c7ab40
      > Author: Aleksey Midenkov <midenok@gmail.com>
      > Date:   Mon Apr 18 12:44:27 2022 +0300
      >
      >    MDEV-24176 Server crashes after insert in the table with virtual
      >    column generated using date_format() and if()
      
      Adding an mtr test only.
  9. 18 Jun, 2024 2 commits
    • MDEV-34178: Simplify the U lock · 2bd661ca
      Marko Mäkelä authored
      The U lock mode of the sux_lock that was introduced in
      commit 03ca6495 (MDEV-24142)
      is unnecessarily complex.
      
      Internally, sux_lock comprises two parts, each with their own wait queue
      inside the operating system kernel: a mutex and a rw-lock.
      
      We can map the operations as follows:
      
      x_lock(): (X,X)
      u_lock(): (X,_)
      s_lock(): (_,S)
      
      The Update lock mode, which is mutually exclusive with itself and with
      X (exclusive) locks but not with shared (S) locks, was unnecessarily
      acquiring a shared lock on the second component. The mutual exclusion
      is guaranteed by the first component.
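
The mapping of the three lock modes onto the two components can be modelled with non-blocking "try" operations. This is a hypothetical sketch, not the sux_lock implementation: the class sux_model and its members are invented for illustration, both components are reduced to atomics, and the wait queues that the real components have inside the operating system kernel are omitted.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical model of the two-component lock: (mutex, rw-lock).
class sux_model
{
  std::atomic<bool> mutex_held{false};  // first component
  std::atomic<int32_t> rw{0};           // second: >0 readers, -1 exclusive

  bool try_mutex()
  {
    bool expected= false;
    return mutex_held.compare_exchange_strong(expected, true);
  }
public:
  // x_lock(): (X,X), both components are held exclusively.
  bool try_x_lock()
  {
    if (!try_mutex())
      return false;
    int32_t expected= 0;
    if (rw.compare_exchange_strong(expected, -1))
      return true;
    mutex_held.store(false);  // readers are present: back out
    return false;
  }
  void x_unlock() { rw.store(0); mutex_held.store(false); }

  // u_lock(): (X,_), only the first component is needed.
  bool try_u_lock() { return try_mutex(); }
  void u_unlock() { mutex_held.store(false); }

  // s_lock(): (_,S), a shared hold on the second component only.
  bool try_s_lock()
  {
    int32_t r= rw.load();
    while (r >= 0)
      if (rw.compare_exchange_weak(r, r + 1))
        return true;
    return false;
  }
  void s_unlock()
  {
    [[maybe_unused]] const int32_t prev= rw.fetch_sub(1);
    assert(prev > 0);
  }
};
```

In this model a U holder excludes other U and X requests through the first component alone, while S requests proceed because they never touch it.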
      
      We might simplify the #ifdef SUX_LOCK_GENERIC case further by omitting
      srw_mutex_impl::lock, because it is kind-of duplicating the mutex
      that we will use for having a wait queue. However, the predicate
      buf_page_t::can_relocate() would depend on the predicate
      is_locked_or_waiting(), which is not available for pthread_mutex_t.
      
      Reviewed by: Debarun Banerjee
      Tested by: Matthias Leich
    • MDEV-23857: replication master password length · 6cab2f75
      Brandon Nesterenko authored
      After MDEV-4013, the maximum length of replication passwords was extended to
      96 ASCII characters. After a restart, however, slaves only read the first 41
      characters of MASTER_PASSWORD from the master.info file. This led to slaves
      being unable to reconnect to the master after a restart.
      
      After a slave restart, if a master.info file is detected, use the full
      allowable length of the password rather than 41 characters.
      
      Reviewed By:
      ============
      Sergei Golubchik <serg@mariadb.com>