1. 18 Sep, 2023 1 commit
  2. 15 Sep, 2023 9 commits
    • Merge branch '10.5' into 10.6 · 0f870914
      Yuchen Pei authored
    • Merge branch '10.4' into 10.5 · cf816263
      Yuchen Pei authored
    • MDEV-32157 MDEV-28856 Spider: Tests, documentation, small fixes and cleanups · 18990f00
      Yuchen Pei authored
      Removed some redundant hint related string literals from
      spd_db_conn.cc
      
      Clean up SPIDER_PARAM_*_[CHAR]LEN[S]
      
      Adding tests covering monitoring_kind=2. What it does is that it reads
      from mysql.spider_link_mon_servers with matching db_name, table_name,
      link_id, and does not do anything about that...
      
      How monitoring_* can be useful: in the deprecated spider high
      availability feature, when one remote fails, spider will try another
      remote, which apparently makes use of these table parameters.
      
      A test covering the query_cache_sync table param. Some further tests
      on some spider table params.
      
      Wrapper should be case insensitive.
      
      Code documentation on spider priority binary tree.
      
      Add an assertion that static_key_cardinality is always -1. All tests
      still pass.
    • MDEV-32157 MDEV-28856 Spider: drop server in tests · 3b3200e2
      Yuchen Pei authored
      This helps eliminate "server exists" failures
      
      Also, spider/bugfix.mdev_29676, when enabled after MDEV-29525 is
      pushed, will fail because we have not --recorded the result. But the
      failure will only emerge when working on MDEV-31138, where we manually
      re-enable this test, so let's worry about that then.
    • Merge branch '10.5' into 10.6 · b70d8fbf
      Yuchen Pei authored
    • MDEV-29502 Fix some issues with spider direct aggregate · 68a00207
      Yuchen Pei authored
      The direct aggregate (DA) mechanism seems to be intended to work only
      when a full table scan query would otherwise be executed from the spider
      node and the aggregation done at the spider node too. Typically this
      happens in sub_select(). In the test spider.direct_aggregate_part,
      direct aggregate allows sending COUNT statements directly to the data
      nodes and adding up the results at the spider node, instead of iterating
      over the rows one by one at the spider node.
      
      By contrast, the group by handler (GBH) typically sends aggregated
      queries directly to data nodes, in which case DA does not improve the
      situation here.
      
      That is why we should fix it by disabling DA when GBH is used.
      
      There are other reasons supporting this change. First, the creation of
      GBH results in a call to change_to_use_tmp_fields() (as opposed to
      setup_copy_fields()) which causes the spider DA function
      spider_db_fetch_for_item_sum_funcs() to work on wrong items. Second,
      the spider DA function only calls direct_add() on the items, and the
      follow-up add() needs to be called by the sql layer code. In
      do_select(), after executing the query with the GBH, it seems that the
      required add() would not necessarily be called.
      
      Disabling DA when GBH is used does fix the bug. There are a few
      other things included in this commit to improve the situation with
      spider DA:
      
      1. Add a session variable that allows the user to disable DA completely.
      This will help as a temporary measure if/when further bugs with DA
      emerge.
      
      2. Move the increment of direct_aggregate_count to the spider DA
      function. Currently this is done in rather bizarre and random
      locations.
      
      3. Fix the spider_db_mbase_row creation so that the last of its row
      fields (the sentinel) is NULL. The code is already doing a null check, but
      somehow the sentinel field is on an invalid address, causing the
      segfaults. With a correct implementation of the row creation, we can
      avoid such segfaults.
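
      A minimal sketch of the sentinel idea in item 3, not the actual
      spider_db_mbase_row code: the names Row, make_row and count_fields are
      hypothetical. The row stores one extra pointer after the last field and
      sets it to nullptr, so a consumer that walks the fields until it sees a
      null pointer always stops inside the allocation.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a fetched row: one pointer per field, followed
// by a single nullptr sentinel. This is not the real spider row layout.
struct Row
{
  std::vector<const char*> fields;  // size is field count + 1
};

// Build a row whose last slot is always the nullptr sentinel.
// The pointers refer to `values`, which must outlive the returned row.
Row make_row(const std::vector<std::string> &values)
{
  Row row;
  row.fields.reserve(values.size() + 1);
  for (const std::string &v : values)
    row.fields.push_back(v.c_str());
  row.fields.push_back(nullptr);  // the sentinel
  return row;
}

// Walk the row the way a null-checking consumer would: stop at the sentinel.
std::size_t count_fields(const Row &row)
{
  std::size_t n = 0;
  for (const char *const *p = row.fields.data(); *p != nullptr; ++p)
    ++n;
  return n;
}
```

      If the sentinel slot were left uninitialized, or never allocated, the
      loop in count_fields() would read past the valid fields, which is the
      kind of failure the null check alone cannot prevent.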
    • Merge branch '10.4' into 10.5 · e95e9a22
      Yuchen Pei authored
    • MDEV-31787 MDEV-26151 Add a test exercising non-0 spider_casual_read · 96760d3a
      Yuchen Pei authored
      Also:
      - clean up spider_check_and_get_casual_read_conn() and
        spider_check_and_set_autocommit()
      - remove a couple of commented out code blocks
  3. 14 Sep, 2023 10 commits
  4. 13 Sep, 2023 5 commits
    • MDEV-31177: SHOW SLAVE STATUS Last_SQL_Errno Race Condition on Errored Slave Restart · 1407f999
      Brandon Nesterenko authored
      The SQL thread and a user connection executing SHOW SLAVE STATUS
      have a race condition on Last_SQL_Errno: when a slave that
      previously errored and stopped is started again, SHOW SLAVE STATUS
      can show that the SQL thread is running while the previous error is
      still showing.
      
      The fix is to clear the last error, when the SQL thread starts,
      before setting the status of Slave_SQL_Running.
      
      Thanks to Kristian Nielsen for his work diagnosing the problem!
      
      Reviewed By:
      ============
      Andrei Elkin <andrei.elkin@mariadb.com>
      Kristian Nielsen <knielsen@knielsen-hq.org>
    • MDEV-31038: rpl.rpl_xa_prepare_gtid_fail clean up · 7de0c7b5
      Brandon Nesterenko authored
      - Removed commented out and unused lines.
      - Updated test to reference true failure of timeout
        rather than deadlock
      - Switched save variables from MTR to user
      - Forced relay-log purge to not potentially re-execute
        an already prepared transaction
    • MDEV-31369 Disable TLS v1.0 and 1.1 for MariaDB · 1831f8e4
      Daniel Black authored
      Remove TLSv1.1 from the default tls_version system variable.
      
      Output a warning if TLSv1.0 or TLSv1.1 are selected.
      
      Thanks Tingyao Nian for the feature request.
    • post-merge fix · 9e9cefde
      Sergei Golubchik authored
    • MDEV-31315 Add client_ed25519.dll to the list of plugins shipped with HeidiSQL · 5fe8d0d5
      Oleg Smirnov authored
      There is a list of plugins in the WiX configuration file for HeidiSQL,
      and the installer only installs DLLs from that list although the HeidiSQL
      portable archive may include other plugins.
      
      This commit adds client_ed25519.dll to this list and also rearranges
      the list alphabetically, so it is easier to verify its contents
  5. 12 Sep, 2023 4 commits
    • MDEV-32150 InnoDB reports corruption on 32-bit platforms with ibd files sizes > 4GB · d20a4da2
      Marko Mäkelä authored
      buf_read_page_low(): Use 64-bit arithmetic when computing the
      file byte offset. In other calls to fil_space_t::io() the offset
      was being computed correctly, for example by
      buf_page_t::physical_offset().
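
      To illustrate the class of bug, here is a small sketch with
      hypothetical helper names (these are not InnoDB functions). When the
      page number and page size are multiplied in 32-bit arithmetic, the
      product wraps at 4 GiB; widening one operand first keeps the full
      offset.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only.

// Problematic variant: both operands are 32-bit, so the multiplication is
// carried out in 32 bits and wraps before the result is widened.
inline uint64_t offset_32bit_math(uint32_t page_no, uint32_t page_size)
{
  return page_no * page_size;
}

// Corrected variant: widen one operand first so that the multiplication
// itself is performed in 64 bits.
inline uint64_t offset_64bit_math(uint32_t page_no, uint32_t page_size)
{
  return uint64_t{page_no} * page_size;
}
```

      With a 16 KiB page size, page 262144 starts exactly at byte
      4294967296, which the 32-bit variant reports as 0.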
    • MDEV-30100 fixup: Remove a failing debug assertion · 736901b4
      Marko Mäkelä authored
      trx_purge_truncate_history(): Remove a debug assertion that
      had originally been added in
      commit 0de3be8c (MDEV-30671).
      In trx_t::commit_empty() we do not have any efficient way to rewind
      rseg.needs_purge to an accurate value that would satisfy this
      debug assertion.
      
      Note: No correctness property should be violated here. At the point
      where the debug assertion was located, we had already established
      that purge_sys.sees(rseg.needs_purge) holds, that is, it is safe
      to remove everything from rseg.
    • MDEV-26782 fixup: Remove dead code · 3c840ae7
      Marko Mäkelä authored
      trx_undo_reuse_cached(): Assert that this is being invoked on the
      persistent rollback segment of the transaction, and remove dead code
      that was handling cached temporary undo log. This was missed in
      commit 51e62cb3 (MDEV-26782).
    • MDEV-31833 replication breaks when using optimistic replication and replica is a galera node · a3cbc44b
      sjaakola authored
      The MariaDB async replication SQL thread was stopped on any failure
      to apply replication events, and the error message logged for the
      failure was: "Node has dropped from cluster". The assumption was that
      an event applying failure is always due to the node dropping out.
      With optimistic parallel replication, event applying can fail for natural
      reasons, and applying should be retried to handle the failure. This retry
      logic was never exercised because the slave SQL thread was stopped at the
      first applying failure.
      
      To support the retry logic of optimistic parallel replication, this commit
      now skips the replication slave abort if the node remains in the cluster
      (wsrep_ready==ON) and replication is configured for optimistic or
      aggressive retry logic.
      
      During the development of this fix, galera.galera_as_slave_nonprim test showed
      some problems. The test was analyzed, and it appears to need some attention.
      One excessive sleep command was removed in this commit, but it will need more
      fixes still to be fully deterministic. After this commit galera_as_slave_nonprim
      is successful, though.
      Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
  6. 11 Sep, 2023 11 commits
    • galera: wsrep-lib submodule update · 1adfdfbd
      Julius Goryavsky authored
    • MDEV-32051 Failed to insert streaming client · ef4b59fa
      Daniele Sciascia authored
      - Deterministic test to reproduce the warning
      - Update wsrep-lib to fix the issue
      Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
    • MDEV-31988 : galera_partition test: assertion due to unallowed state transition · fee138a1
      Jan Lindström authored
      Test case is starting too many servers that are not really
      needed for original problem testing. This fix reduces
      number of servers to make test case smaller and more
      robust.
      Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
    • MDEV-29861 : Galera "notify" test cases hang · 632a503c
      Jan Lindström authored
      The problem was that if wsrep_notify_cmd was set, the script was called
      with the new status "joined" and tried to connect to the server
      to update some table, but the server was not initialized yet and
      was not listening for connections. So the server waits for the
      script to finish, the script waits for the mariadb client to connect,
      and the client cannot connect, because the server isn't listening.
      
      The fix is to call the script only when Galera has already formed a
      view or when the node is synced or a donor.
      
      This fix also enables the following test cases:
      * galera.MW-284
      * galera.galera_binlog_checksum
      * galera_var_notify_ssl_ipv6
      Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
    • MDEV-32145 Disable read-ahead for temporary tablespace · a03b8cd0
      Thirunarayanan Balathandayuthapani authored
      - The lifetime of temporary tables is expected to be short, so it
      seems reasonable to assume that all temporary tablespace pages
      will remain in the buffer pool. It does not make sense to have
      read-ahead for pages of the temporary tablespace.
    • MDEV-32134 InnoDB hang in buf_flush_wait_LRU_batch_end() · cdd2fa7f
      Marko Mäkelä authored
      buf_flush_page_cleaner(): Before finishing a batch, wake up any threads
      that are waiting for buf_pool.done_flush_LRU.
      
      This should fix a hung shutdown that we observed
      after SET GLOBAL innodb_buffer_pool_size was executed
      to shrink the InnoDB buffer pool.
    • MDEV-32103 InnoDB ALTER TABLE is not crash-safe · 466d9f5f
      Marko Mäkelä authored
      Starting with commit 4ff5311d,
      log_write_up_to(trx->commit_lsn, true) in DDL operations could end up
      being a no-op, because trx->commit_lsn would be 0.
      
      trx_flush_log_if_needed(): Revert an incorrect attempt to ensure
      that DDL operations are crash-safe.
      
      trx_t::commit(std::vector<pfs_os_file_t> &), ha_innobase::rename_table():
      Set trx_t::flush_log_later so that trx_t::commit_in_memory() will
      retain trx_t::commit_lsn for the final durability call.
      
      Tested by: Matthias Leich
    • MDEV-30531 Corrupt index(es) on busy table when using FOREIGN KEY · 4a8291fc
      Marko Mäkelä authored
      lock_wait(): Never return the transient error code DB_LOCK_WAIT.
      In commit 78a04a4c (MDEV-29869),
      some assignments trx->error_state = DB_SUCCESS were removed,
      and it was possible that the field was left at its initial value
      DB_LOCK_WAIT.
      
      The test case for this is nondeterministic; without this fix, it
      would only occasionally fail.
      
      Reviewed by: Vladislav Lesin
    • MDEV-32096 Parallel replication lags because innobase_kill_query() may fail to interrupt a lock wait · e039720b
      Marko Mäkelä authored
      lock_sys_t::cancel(trx_t*): Remove, and merge to its only caller
      innobase_kill_query().
      
      innobase_kill_query(): Before reading trx->lock.wait_lock,
      do acquire lock_sys.wait_mutex, like we did before
      commit e71e6133 (MDEV-24671).
      In this way, we should not miss a recently started lock wait
      by the killee transaction.
      
      lock_rec_lock(): Add a DEBUG_SYNC "lock_rec" for the test case.
      
      lock_wait(): Invoke trx_is_interrupted() before entering the wait,
      in case innobase_kill_query() was invoked some time earlier and
      some longer-running operation did not check for interrupts.
      As suggested by Vladislav Lesin, do not overwrite
      trx->error_state==DB_INTERRUPTED with DB_SUCCESS.
      This would avoid a call to trx_is_interrupted() when the test is
      modified to use the DEBUG_SYNC point lock_wait_start instead of lock_rec.
      Avoid some redundant loads of trx->lock.wait_lock; cache the value
      in the local variable wait_lock.
      
      Deadlock::check_and_resolve(): Take wait_lock as a parameter and
      return wait_lock (or -1 or nullptr). We only need to reload
      trx->lock.wait_lock if lock_sys.wait_mutex had been released
      and reacquired.
      
      trx_t::error_state: Correctly document the data member.
      
      trx_lock_t::was_chosen_as_deadlock_victim: Clarify that other threads
      may set the field (or flags in it) while holding lock_sys.wait_mutex.
      
      Thanks to Johannes Baumgarten for reporting the problem and testing
      the fix, as well as to Kristian Nielsen for suggesting the fix.
      
      Reviewed by: Vladislav Lesin
      Tested by: Matthias Leich
    • Merge 10.5 into 10.6 · 0dd25f28
      Marko Mäkelä authored
    • MDEV-21679 fixup for s390x · ef569c32
      Marko Mäkelä authored
      Some s390x environments include
      https://github.com/madler/zlib/pull/410
      and a more pessimistic compressBound: (sourceLen * 16 + 2308) / 8 + 6.
      Let us adjust the recently enabled tests accordingly.
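
      The quoted bound can be evaluated directly. The function name below is
      hypothetical, and the sketch only transcribes the formula from this
      message; it does not call zlib.

```cpp
#include <cstdint>

// Hypothetical helper: evaluates the more pessimistic bound quoted above,
// (sourceLen * 16 + 2308) / 8 + 6, using integer division.
inline uint64_t pessimistic_compress_bound(uint64_t source_len)
{
  return (source_len * 16 + 2308) / 8 + 6;
}
```

      For a 16384-byte input this gives 33062 bytes, roughly twice the input
      size, so a test that assumes a bound close to the input size would need
      a different expectation in such environments.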