1. 27 Feb, 2024 2 commits
  2. 26 Feb, 2024 1 commit
  3. 20 Feb, 2024 1 commit
  4. 16 Feb, 2024 4 commits
    • MDEV-28430: Fix memory barrier missing of lf_alloc on Arm64 · 8a505980
      Xiaotong Niu authored
      When testing MariaDB on Arm64, a stall can occur; see
      https://jira.mariadb.org/browse/MDEV-28430.
      
      The stall occurs because of an unexpected circular reference in the
      LF_PINS->purgatory list which is traversed in lf_pinbox_real_free().
      
      We found that on Arm64 the ABA problem in the LF_ALLOCATOR->top list was
      not solved, so various kinds of undefined behavior can occur, including a
      circular reference in the LF_PINS->purgatory list.
      
      The following code is meant to solve the ABA problem; it is copied
      from the link below.
      https://github.com/MariaDB/server/blob/cb4c2713553c5f522d2a4ebf186c6505384c748d/mysys/lf_alloc-pin.c#L501-#L505
      
      501  do
      502  {
      503    node= allocator->top;
      504    lf_pin(pins, 0, node);
      505  } while (node != allocator->top && LF_BACKOFF());
      
      1. ABA problem on Arm64
      The steps below show how the ABA problem occurs on Arm64. The code in
      each step is simplified, and the line numbers refer to MariaDB v10.4.
      ------------------------------------------------------------------------
      Abnormal case.
      Initial state: pin = 0, top = A, top list: A->B
      
      T1                              T2
                                      step1. write top=B //seq-cst, #L517
                                      step2. write A->next= "any"
                                      step3. read pin==0 //relaxed, #L295
      step1. write pin=A  //seq-cst, #L504
      step2. read old value of top==A  //relaxed, #L505
      step3. next=A->next="any" //#L517
                                      step4. write A->next=B,top=A //#L420-435
      step4. CAS(top,A,next) //#L517
      step5. write pin=0     //#L521
      ------------------------------------------------------------------------
      The above case arises because T1.step2 reads the old value of top, which
      allows "T1.step3, T1.step4" and "T2.step4" to run concurrently; in other
      words, they are not mutually exclusive.
      
      T2.step4 may then be sandwiched between T1.step3 and T1.step4, which
      causes top to be updated to "any", which may be an in-use or invalid
      address.
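
The abnormal case above can be replayed deterministically in a single thread. The following is only a sketch under stated assumptions: `node_t` and `replay_abnormal_case()` are hypothetical names, not part of lf_alloc-pin.c, and the stale read in T1.step2 is modeled by saving the old value of top up front rather than by real hardware reordering.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical minimal node type; the real allocator nodes differ. */
typedef struct node { struct node *next; } node_t;

/* Replays the T1/T2 steps in the order shown in the table and returns the
   final value of top. */
static node_t *replay_abnormal_case(node_t *a, node_t *b, node_t *any)
{
  _Atomic(node_t *) top;
  _Atomic(node_t *) pin;

  /* Initial state: pin = 0, top = A, top list: A->B */
  atomic_init(&top, a);
  atomic_init(&pin, NULL);
  a->next = b;
  node_t *stale_top = atomic_load(&top);   /* the old value T1 will observe */

  atomic_store(&top, b);                   /* T2.step1: write top=B */
  a->next = any;                           /* T2.step2: write A->next="any" */
  node_t *seen_pin = atomic_load(&pin);    /* T2.step3: read pin==0 */
  assert(seen_pin == NULL);                /* so T2 believes A is unpinned */

  atomic_store(&pin, a);                   /* T1.step1: write pin=A */
  node_t *node = stale_top;                /* T1.step2: stale read, top==A */
  node_t *next = node->next;               /* T1.step3: next="any" */

  a->next = b;                             /* T2.step4: write A->next=B ... */
  atomic_store(&top, a);                   /* ... and top=A */

  node_t *expected = node;                 /* T1.step4: CAS(top, A, next) */
  atomic_compare_exchange_strong(&top, &expected, next);
  atomic_store(&pin, NULL);                /* T1.step5: write pin=0 */

  return atomic_load(&top);
}
```

In this replay the CAS succeeds because top is A again, so top ends up pointing at "any" instead of B.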
      
      2. Analyze above issue with Dekker's algorithm
      The above problem can be mapped to Dekker's algorithm; see
      https://en.wikipedia.org/wiki/Dekker%27s_algorithm.
      The following extracts the read and write operations on 'top' and 'pin'
      and maps them to Dekker's algorithm to analyze the root cause.
      ------------------------------------------------------------------------
      Initial state: top = A, pin = 0
      T1                                    T2
      store_seq_cst(pin, A) // write pin    store_seq_cst(top, B)  //write top
      rt= load_relaxed(top) // read top     rp= load_relaxed(pin)  //read pin
      
      if (rt == A && rp == 0) printf("oops\n"); // will "oops" be printed?
      ------------------------------------------------------------------------
      How T1 and T2 enter their critical sections:
      (1) T1 writes pin. If T1 then reads that top has not been updated, T1
      enters its critical section (T1.step3 and T1.step4, trying to obtain 'A',
      #L517); otherwise it just gives up (T1 has no priority).
      (2) T2 writes top. If T2 then reads that pin has not been updated, T2
      enters its critical section (T2.step4, trying to add 'A' to the top list
      again, #L420-435); otherwise it waits until pin!=A (T2 has priority).
      
      In the previous code, 'top' and 'pin' are loaded with relaxed semantics,
      so on arm and ppc there is no guarantee that the above critical sections
      are mutually exclusive; in other words, "oops" can be printed.
      
      This bug only happens on arm and ppc, not on x86. On current x86
      implementations, a load is always seq-cst (relaxed and seq-cst loads
      generate the same machine code), as shown in https://godbolt.org/z/sEzMvnjd9
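
The Dekker-style litmus test above can also be checked by brute force. The sketch below enumerates every interleaving of the four operations and counts how many end with rt == A and rp == 0. It is an illustration under a crude assumption, not a model of the C11 or Arm memory model: a relaxed load is represented simply as a load that may be performed before the store that precedes it in program order. The names `run_schedule()` and `count_oops()` are hypothetical.

```c
#include <assert.h>

enum { EMPTY = 0, A = 1, B = 2 };

/* Runs one interleaving. Bit i of schedule selects the thread that executes
   step i (0 = T1, 1 = T2). tN_load_first != 0 models thread N's load being
   performed before its own store. Returns 1 if "oops" would be printed. */
static int run_schedule(unsigned schedule, int t1_load_first, int t2_load_first)
{
  int top = A, pin = EMPTY;   /* initial state: top = A, pin = 0 */
  int rt = -1, rp = -1;       /* values read by T1 and T2 */
  int t1_done = 0, t2_done = 0;

  for (int i = 0; i < 4; i++)
  {
    if (((schedule >> i) & 1u) == 0)
    {
      int is_load = (t1_done == 0) ? t1_load_first : !t1_load_first;
      if (is_load) rt = top; else pin = A;      /* T1: read top / write pin */
      t1_done++;
    }
    else
    {
      int is_load = (t2_done == 0) ? t2_load_first : !t2_load_first;
      if (is_load) rp = pin; else top = B;      /* T2: read pin / write top */
      t2_done++;
    }
  }
  return rt == A && rp == EMPTY;
}

/* Counts the "oops" outcomes over all 6 interleavings in which each thread
   executes exactly two steps. */
static int count_oops(int t1_load_first, int t2_load_first)
{
  int oops = 0;
  for (unsigned schedule = 0; schedule < 16; schedule++)
  {
    int t2_steps = 0;
    for (int i = 0; i < 4; i++)
      t2_steps += (int) ((schedule >> i) & 1u);
    if (t2_steps != 2)
      continue;                                 /* not a valid interleaving */
    oops += run_schedule(schedule, t1_load_first, t2_load_first);
  }
  assert(oops >= 0 && oops <= 6);
  return oops;
}
```

With both threads in program order, `count_oops(0, 0)` finds no "oops" outcome. As soon as either load is allowed to move ahead of its store, at least one interleaving produces it, which is consistent with the fix touching both the read of 'top' and the reads of 'pin'.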
      
      3. Fix method
      Add sequential-consistency semantics to the read of 'top' in #L505
      (T1.step2), and to the reads of "el->pin[i]" in #L295 and #L320.
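
Expressed with C11 atomics, the pin loop with a sequentially consistent re-read of top might look like the sketch below. This is not the actual patch: `allocator_t`, `pins_t`, `pin_top()` and `is_pinned()` are hypothetical stand-ins for LF_ALLOCATOR, LF_PINS and the code around #L503-#L505 and #L295, and LF_BACKOFF() is omitted.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the real structures. */
typedef struct node { struct node *next; } node_t;
typedef struct { _Atomic(node_t *) top; } allocator_t;
typedef struct { _Atomic(node_t *) pin; } pins_t;

/* Pins the current top node and returns it (T1.step1 and T1.step2). */
static node_t *pin_top(allocator_t *allocator, pins_t *pins)
{
  node_t *node;
  assert(allocator != NULL && pins != NULL);
  do
  {
    /* The first read only proposes a candidate, so relaxed is enough here. */
    node = atomic_load_explicit(&allocator->top, memory_order_relaxed);
    /* Publish the pin with seq-cst semantics, as lf_pin() does at #L504. */
    atomic_store_explicit(&pins->pin, node, memory_order_seq_cst);
    /* Re-read top with seq-cst semantics so that this load cannot be
       ordered before the pin store above. */
  } while (node != atomic_load_explicit(&allocator->top, memory_order_seq_cst));
  return node;
}

/* Scanner side (T2.step3): a seq-cst read of the pin. */
static int is_pinned(pins_t *pins, const node_t *node)
{
  return atomic_load_explicit(&pins->pin, memory_order_seq_cst) == node;
}
```

The point of the sketch is the pairing: a seq-cst store followed by a seq-cst load on each side, which is what rules out the "oops" outcome in the litmus test.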
      
      4. Issue reproduction
      Add a "delay" after #L503 in lf_alloc-pin.c. When running unit.lf, a
      segmentation fault then occurs quickly because "top" points to an invalid
      address. For details, see the comments at the link below.
      https://jira.mariadb.org/browse/MDEV-28430.
      
      5. Further improvement
      To make this code more robust and safe on all platforms, we recommend
      replacing volatile with C11 atomics and fixing all data races. This will
      also make the code easier to reason about.
      Signed-off-by: Xiaotong Niu <xiaotong.niu@arm.com>
    • MDEV-33468: Crash due to missing stack overrun check in two recursive functions · 5707f1ef
      Kristian Nielsen authored
      Thanks to Yury Chaikou for finding this problem (and the fix).
      Reviewed-by: Monty <monty@mariadb.org>
      Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
    • MDEV-33443: Unsafe use of LOCK_thd_kill in my_malloc_size_cb_func() · fdaa7a96
      Kristian Nielsen authored
      my_malloc_size_cb_func() can be called from contexts where it is not safe to
      wait for LOCK_thd_kill, for example while holding LOCK_plugin. This could
      lead to (probably very unlikely) deadlock of the server.
      
      Fix by skipping the enforcement of --max-session-mem-used in the rare cases
      when LOCK_thd_kill cannot be obtained. The limit will instead be enforced on
      the following memory allocation. This does not significantly degrade the
      behaviour of --max-session-mem-used; that limit is in any case only enforced
      "softly", not taking effect until the next point at which the thread does a
      check_killed().
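
The "skip enforcement when the lock cannot be obtained" idea can be sketched as a try-lock. This is only an illustration with hypothetical names (`session_t`, `session_init()`, `account_alloc()`); an `atomic_flag` stands in for LOCK_thd_kill, and the real my_malloc_size_cb_func() is not reproduced here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-session state. */
typedef struct {
  atomic_flag kill_lock;   /* stand-in for LOCK_thd_kill */
  size_t mem_used;         /* bytes currently accounted to the session */
  size_t max_mem_used;     /* stand-in for --max-session-mem-used */
  bool killed;             /* set when the limit has been enforced */
} session_t;

static void session_init(session_t *s, size_t max_mem_used)
{
  atomic_flag_clear(&s->kill_lock);
  s->mem_used = 0;
  s->max_mem_used = max_mem_used;
  s->killed = false;
}

/* Accounts an allocation and enforces the limit only if the lock is free. */
static void account_alloc(session_t *s, size_t size)
{
  assert(s != NULL);
  s->mem_used += size;
  if (s->mem_used <= s->max_mem_used)
    return;                               /* still within the limit */
  if (atomic_flag_test_and_set(&s->kill_lock))
    return;                               /* lock busy: skip, retry on the
                                             next allocation instead */
  s->killed = true;                       /* enforce the limit */
  atomic_flag_clear(&s->kill_lock);       /* release the lock */
}
```

Because the accounting still happens on the skipped call, the next allocation sees the limit exceeded and enforces it once the lock is available.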
      Reviewed-by: Monty <monty@mariadb.org>
      Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
    • MDEV-33426: Aria temptables wrong thread-specific memory accounting in slave thread · c73c6aea
      Kristian Nielsen authored
      Aria temporary tables account allocated memory as specific to the current
      THD. But this fails for slave threads, where the temporary tables need to be
      detached from any specific THD.
      
      Introduce a new flag to mark temporary tables in replication as "global",
      and use that inside Aria to not account memory allocations as thread
      specific for such tables.
      
      Based on original suggestion by Monty.
      Reviewed-by: Monty <monty@mariadb.org>
      Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
  5. 13 Feb, 2024 1 commit
  6. 12 Feb, 2024 3 commits
    • MDEV-30528 CREATE FULLTEXT INDEX assertion failure WITH SYSTEM VERSIONING · ca88eac8
      Marko Mäkelä authored
      ha_innobase::check_if_supported_inplace_alter(): Require ALGORITHM=COPY
      when creating a FULLTEXT INDEX on a versioned table.
      
      row_merge_buf_add(), row_merge_read_clustered_index(): Remove the parameter
      or local variable history_fts that had been added in the attempt to fix
      MDEV-25004.
      
      Reviewed by: Thirunarayanan Balathandayuthapani
      Tested by: Matthias Leich
    • MDEV-33441 Do not deinit plugin variables when retry requested · c37216de
      Yuchen Pei authored
      After MDEV-31400, plugins are allowed to ask for retries when failing
      initialisation. However, such failures also cause plugin system
      variables to be deleted (plugin_variables_deinit()) before retrying
      and are not re-added during retry.
      
      We fix this by checking that if the plugin has requested a retry the
      variables are not deleted. Because plugin_deinitialize() also calls
      plugin_variables_deinit(), if the retry fails, the variables will
      still be deleted.
      
      Alternatives considered:
      
      - remove the plugin_variables_deinit() from plugin_initialize() error
      handling altogether. We decide to take a more conservative approach
      here.
      
      - re-add the system variables during retry. It is more complicated
      than simply iterating over plugin->system_vars and call
      my_hash_insert(). For example we will need to assign values to
      the test_load field and extract more code from test_plugin_options(),
      if that is possible.
  7. 11 Feb, 2024 1 commit
  8. 09 Feb, 2024 1 commit
  9. 08 Feb, 2024 4 commits
    • MDEV-15703: Crash in EXECUTE IMMEDIATE 'CREATE OR REPLACE TABLE t1 (a INT DEFAULT ?)' USING DEFAULT · e48bd474
      Dmitry Shulga authored
      This patch fixes the handling of DEFAULT or IGNORE values passed to
      positional parameters for some kinds of SQL statements executed
      as prepared statements.
      
      The main idea of the patch is to associate the actual value passed
      by the USING clause with the positional parameter represented by
      the Item_param class. This association must be performed on execution of
      an UPDATE statement in PS/SP mode. Another corner case that results in
      a server crash is the handling of CREATE TABLE when a positional parameter
      is placed after the DEFAULT clause, or of a CALL statement, when either
      the value DEFAULT or IGNORE is passed as the actual value for the
      positional parameter. This case is fixed by checking, in the function
      pack_vcols(), whether an error is set in the diagnostics area on return
      from the function pack_expression().
    • MDEV-15703: Crash in EXECUTE IMMEDIATE 'CREATE OR REPLACE TABLE t1 (a INT... · 6b2cd786
      Dmitry Shulga authored
      MDEV-15703: Crash in EXECUTE IMMEDIATE 'CREATE OR REPLACE TABLE t1 (a INT DEFAULT ?)' USING DEFAULT, UBSAN runtime error: member call on null pointer of type 'struct TABLE_LIST' in Item_param::save_in_field
      
      This is a prerequisite patch that refactors the method
        Item_default_value::fix_fields.
      The former implementation of this method was extracted and placed
      into the standalone function make_default_field() and the method
      Item_default_value::tie_field(). The motivation for this modification
      is the upcoming changes to the core implementation of the task MDEV-15703,
      since these functions will be used from several places within
      the source code.
    • MDEV-33400 Adaptive hash index corruption after DISCARD TABLESPACE · 85db5347
      Marko Mäkelä authored
      row_discard_tablespace(): Do not invoke dict_index_t::clear_instant_alter()
      because that would corrupt any adaptive hash index entries in the table.
      
      row_import_for_mysql(): Invoke dict_index_t::clear_instant_alter()
      after detaching any adaptive hash index entries.
    • bump the VERSION · 23101304
      Daniel Bartholomew authored
  10. 06 Feb, 2024 1 commit
  11. 05 Feb, 2024 1 commit
    • Fix commit 179424db: No test file or result files should be executable · 3812e1c9
      Otto Kekäläinen authored
      In commit 179424db the file lowercase_table2.result was made executable
      for no known reason, most likely just a mistake. Test result files
      definitely should not be executable.
      
      All new code of the whole pull request, including one or several files
      that are either new files or modified ones, are contributed under the
      BSD-new license. I am contributing on behalf of my employer Amazon Web
      Services, Inc.
  12. 01 Feb, 2024 1 commit
  13. 31 Jan, 2024 3 commits
  14. 30 Jan, 2024 6 commits
  15. 29 Jan, 2024 9 commits
  16. 26 Jan, 2024 1 commit
    • MDEV-27850: rpl_seconds_behind_master_spike debug_sync fix · 112eb14f
      Brandon Nesterenko authored
      A debug_sync signal could remain for the SQL thread that should have begun
      a wait_for upon seeing a GTID event, but would instead see the old signal
      and continue on without waiting. This broke an "idle" condition in
      SHOW SLAVE STATUS which should have automatically negated
      Seconds_Behind_Master. Instead, because the SQL thread had already
      processed the GTID event, it set sql_thread_caught_up to false, and thereby
      calculated the value of Seconds_Behind_Master, when the test expected 0.
      
      This patch fixes this by resetting the debug_sync state before creating a
      new transaction which sends a GTID event to the replica.