1. 14 Aug, 2024 1 commit
    • MDEV-34678 pthread_mutex_init() without pthread_mutex_destroy() · 4f8803c0
      Marko Mäkelä authored
      When SUX_LOCK_GENERIC is defined, the srw_mutex, srw_lock, sux_lock are
      implemented based on pthread_mutex_t and pthread_cond_t.  This is the
      only option for systems that lack a futex-like system call.
      
      In the SUX_LOCK_GENERIC mode, if pthread_mutex_init() is allocating
      some resources that need to be freed by pthread_mutex_destroy(),
      a memory leak could occur when we are repeatedly invoking
      pthread_mutex_init() without a pthread_mutex_destroy() in between.
      
      pthread_mutex_wrapper::initialized: A debug field to track whether
      pthread_mutex_init() has been invoked.  This also helps find bugs
      like the one that was fixed by
      commit 1c8af2ae (MDEV-34422);
      one simply needs to add -DSUX_LOCK_GENERIC to the CMAKE_CXX_FLAGS
      to catch that particular bug on the initial server bootstrap.
      
      buf_block_init(), buf_page_init_for_read(): Invoke block_lock::init()
      because buf_page_t::init() will no longer do that.
      
      buf_page_t::init(): Instead of invoking lock.init(), assert that it
      has already been invoked (the lock is vacant).
      
      add_fts_index(), build_fts_hidden_table(): Explicitly invoke
      index_lock::init() in order to avoid a pthread_mutex_destroy()
      invocation on an uninitialized object.
      
      srw_lock_debug::destroy(): Invoke readers_lock.destroy().
      
      trx_sys_t::create(): Invoke trx_rseg_t::init() on all rollback segments
      in order to guarantee a deterministic state for shutdown, even if
      InnoDB fails to start up.
      
      trx_rseg_array_init(), trx_temp_rseg_create(), trx_rseg_create():
      Invoke trx_rseg_t::destroy() before trx_rseg_t::init() in order to
      balance pthread_mutex_init() and pthread_mutex_destroy() calls.
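      
      The init/destroy balancing described above can be sketched as follows.
      This is a minimal illustration, not the actual MariaDB class: the name
      tracked_mutex and its members are hypothetical, and the flag is kept
      in all builds here, whereas the commit describes a debug-only field.

```cpp
#include <cassert>
#include <pthread.h>

// Hypothetical wrapper, loosely modeled on the commit's description of
// pthread_mutex_wrapper::initialized. Not the real InnoDB code.
class tracked_mutex {
  pthread_mutex_t mutex_;
  bool initialized_ = false;  // tracks whether init() has been invoked

public:
  void init() {
    // Fails if init() is repeated without a destroy() in between.
    assert(!initialized_);
    int err = pthread_mutex_init(&mutex_, nullptr);
    assert(err == 0);
    (void)err;  // silence unused-variable warnings when NDEBUG is set
    initialized_ = true;
  }

  void destroy() {
    // Fails if destroy() is invoked on an uninitialized object.
    assert(initialized_);
    pthread_mutex_destroy(&mutex_);
    initialized_ = false;
  }

  void lock() {
    assert(initialized_);
    pthread_mutex_lock(&mutex_);
  }

  void unlock() {
    assert(initialized_);
    pthread_mutex_unlock(&mutex_);
  }

  bool is_initialized() const { return initialized_; }
};
```

      With such a flag, re-initializing an object requires an explicit
      destroy() first, which mirrors the destroy()-before-init() pattern
      the commit applies to the rollback segments.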
  2. 09 Aug, 2024 1 commit
  3. 08 Aug, 2024 1 commit
  4. 07 Aug, 2024 1 commit
  5. 03 Aug, 2024 2 commits
  6. 02 Aug, 2024 1 commit
  7. 31 Jul, 2024 2 commits
    • MDEV-15393: Fix rpl_mysqldump_gtid_slave_pos · 001608de
      Brandon Nesterenko authored
      The slave would try to sync_with_master_gtid.inc,
      but the master never actually saved its gtid position
      so the test would move on too quickly.
    • MDEV-34670 IMPORT TABLESPACE unnecessary traverses tablespace list · 533e6d5d
      Thirunarayanan Balathandayuthapani authored
      Problem:
      ========
      - After the commit ada1074b (MDEV-14398),
      fil_crypt_set_encrypt_tables() iterates through all tablespaces to
      fill the default_encrypt tables list. This was a trigger to
      encrypt or decrypt when key rotation age is set to 0. But import
      tablespace calls fil_crypt_set_encrypt_tables() unnecessarily.
      The motivation for the call is to signal the encryption threads.
      
      Fix:
      ====
      ha_innobase::discard_or_import_tablespace: Remove the
      fil_crypt_set_encrypt_tables() and add the import tablespace
      to the default encrypt list if necessary
  8. 30 Jul, 2024 4 commits
    • MDEV-34625 Fix undefined behavior of using uninitialized member variables · 811614d4
      Hugo Wen authored
      Commit a8a75ba2 causes the MariaDB server to crash, usually with signal
      11, at random code locations due to invalid pointer values during any
      table operation. This issue occurs when the server is built with -O3 and
      other customized compiler flags.
      
      For example, the command `use db1;` causes the server to crash in the
      `check_table_access` function at line sql_parse.cc:7080 because
      `tables->correspondent_table` is an invalid pointer value of 0x1.
      
      The crashes are due to undefined behavior from using uninitialized
      variables. The problematic commit a8a75ba2 introduces code that
      allocates memory and sets it to 0 using thd->calloc before initializing
      it with a placement new operation.
      This process depends on setting memory to 0 to initialize member
      variables not explicitly set in the constructor. However, the compiler
      can optimize out the memset/bfill, leading to uninitialized values and
      unpredictable issues.
      
      Once a constructor function initializes an object, any uninitialized
      variables within that object are subject to undefined behavior. The
      state of memory before the constructor runs, whether it involves
      memset or was used for other purposes, is irrelevant after the
      placement new operation.
      
      This behavior can be demonstrated with this
      [test](https://gcc.godbolt.org/z/5n87z1raG) I wrote to examine the
      assembly code. The code in MariaDB can be abstracted to the following,
      though it has many layers wrapped around it and more complex logic,
      causing slight differences in optimization in the MariaDB build.
      To summarize, on x86, the memset in the following code is optimized out
      with both -O2 and -O3 in GCC 13, and is only preserved in the much older
      GCC 4.9.
      
          #include <cstdlib>
          #include <cstring>
          #include <new>

          struct S {
            int i;     // uninitialized in constructor
            S() {}
          };
          int bar() {
            void *buf = malloc(sizeof(S));
            memset(buf, 0, sizeof(S));       // optimized out
            S* s = new(buf) S;
            return s->i;
          }
      
      With GCC13 -O3:
      
          bar():
                sub     rsp, 8
                mov     edi, 4
                call    malloc
                mov     eax, DWORD PTR [rax]
                add     rsp, 8
                ret
      
      With GCC4.9 -O3:
      
          bar():
                sub     rsp, 8
                mov     edi, 4
                call    malloc
                mov     DWORD PTR [rax], 0
                xor     eax, eax
                add     rsp, 8
                ret
      
      Now we ensure the constructor initializes variables correctly by running
      the reset() function in the constructor to perform the memset/bfill(0)
      operation. After applying the fix, the crash is gone.
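      
      The shape of the fix can be sketched as follows. This is an
      illustration under assumptions, not the MariaDB code: the helper names
      make_zeroed() and destroy_s() are hypothetical, and reset() assigns
      each member explicitly here instead of performing the memset/bfill(0)
      that the commit describes.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>

struct S {
  int i;
  void *p;

  // The constructor itself establishes a defined state, so the result
  // no longer depends on what the raw memory contained beforehand.
  S() { reset(); }

  // Stand-in for the commit's reset(); explicit assignments are used
  // here in place of memset/bfill(0).
  void reset() {
    i = 0;
    p = nullptr;
  }
};

// Hypothetical helper: allocate raw memory and construct an S in it.
S *make_zeroed() {
  void *buf = std::malloc(sizeof(S));
  if (!buf)
    return nullptr;
  return new (buf) S;  // placement new runs S::S(), which calls reset()
}

// Hypothetical helper: destroy the object and release the memory.
void destroy_s(S *s) {
  if (!s)
    return;
  s->~S();
  std::free(s);
}
```

      Because the members are written inside the constructor, the compiler
      cannot treat them as indeterminate after the placement new.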
      
      All new code of the whole pull request, including one or several files
      that are either new files or modified ones, are contributed under the
      BSD-new license. I am contributing on behalf of my employer Amazon Web
      Services.
    • MDEV-34580: Assertion `(key_part->key_part_flag & 4) == 0' failed key_hashnr · fdda8171
      Sergei Petrunia authored
      Remove an assert added by fix for MDEV-34417. BNL-H join can be used with
      prefix keys. This happens when there are real prefix indexes on the
      equi-join columns (although it probably doesn't make a lot of sense).
      
      Anyway, remove the assert. The code receives properly truncated key values
      for hashing/comparison so it can handle them just fine.
    • MDEV-34357 InnoDB: Assertion failure in file ./storage/innobase/page/page0zip.cc line 4211 · ee5f7692
      Thirunarayanan Balathandayuthapani authored
      During an InnoDB root page split, InnoDB does the following:
      1) First, move the root records to the new page (p1).
      2) Empty the root and insert the node pointer into the root page.
      3) Split the new page and make the resulting pages child nodes.
      4) Find the split record and allocate another new page (p2)
      for the index.
      5) Store in ret the predecessor of the supremum
      record of the page (p2).
      6) In page_copy_rec_list_start(), move the records from p1 to p2
      up to the split record.
      7) Because the table uses the compressed row format, InnoDB attempts to
      compress the page p2 and fails (due to innodb_compression_level = 0).
      8) Because the compression fails, InnoDB gets the number of
      records (ret_pos) preceding the record (ret) on the page (p2).
      9) Page (p2) is a new page, so ret points to the infimum record and
      ret_pos can be 0. InnoDB wrongly requires that ret_pos not
      be 0 and reports corruption. InnoDB has a similar wrong check in
      page_copy_rec_list_end().
    • MDEV-34181 Instant table aborts after discard tablespace · c038b3c0
      Thirunarayanan Balathandayuthapani authored
      - commit 85db5347 (MDEV-33400)
      retains the instantness in the table definition after discard
      tablespace. So there is no need to assign n_core_null_bytes
      during instant table preparation unless they are not
      initialized.
  9. 29 Jul, 2024 4 commits
    • MDEV-34506 2nd execution name resolution problem with pushdown into unions · 48b256a7
      Rex authored
      Statements affected by this bug need all the following to be true
      1) a derived table or view whose specification contains a set
           operation at the top level.
      2) a grouping operator (group by/having) operating on a column alias
           other than in the first select of the union/intersect
      3) an outer condition that will be pushed into all selects in this
           union/intersect, either into the where or having clause
      
      When pushing a condition into all selects of a unit with more than one
      select, pushdown_cond_for_derived() renames items so we can re-use the
      condition being pushed.
      These names need to be saved and reset for correct name resolution on
      second execution of prepared statements.
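      
      The save-and-reset idea can be illustrated with a small sketch. All
      names here (Item as a plain struct, name_saver, rename(), restore())
      are hypothetical stand-ins chosen for illustration; the real logic
      lives in pushdown_cond_for_derived() and the server's Item classes.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a server item that carries a name.
struct Item {
  std::string name;
};

// Remembers the original name of every renamed item so that the names
// can be put back before the statement is executed a second time.
class name_saver {
  std::vector<std::pair<Item *, std::string>> saved_;

public:
  void rename(Item &item, std::string new_name) {
    saved_.emplace_back(&item, item.name);  // save before overwriting
    item.name = std::move(new_name);
  }

  void restore() {
    // Undo in reverse order so that an item renamed twice ends up
    // with its very first name.
    for (auto it = saved_.rbegin(); it != saved_.rend(); ++it)
      it->first->name = it->second;
    saved_.clear();
  }
};
```

      In this sketch, restore() would be called once per execution, after
      the renamed condition is no longer needed.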
      
      Reviewed by Igor Babaev (igor@mariadb.com)
    • MDEV-34664: Add an option to fix InnoDB's doubling of secondary index cardinalities · 4bf7c966
      Monty authored
      (With trivial fixes by sergey@mariadb.com)
      Added option fix_innodb_cardinality to optimizer_adjust_secondary_key_costs
      
      Using fix_innodb_cardinality disables the 'divide by 2' of rec_per_key_int
      in InnoDB that in effect doubles the Cardinality for secondary keys.
      This has the biggest effect for indexes where a few rows have the same key
      value. Using this may also cause table scans for very small tables (which
      in some cases may be better than an index scan).
      
      The user visible effect is that 'SHOW INDEX FROM table_name' will for
      InnoDB show the true Cardinality (and not 2x the real value). It will
      also allow the optimizer to choose a better index in some cases as the
      division by 2 could have a bad effect for tables with 2-5 identical values
      per key.
      
      A few notes about using fix_innodb_cardinality:
      - It has a direct effect on SHOW INDEX FROM table_name. SHOW INDEX
        will also update the statistics in table share.
      - The effect of fix_innodb_cardinality for query plans or EXPLAIN
        is only visible after first open of the table. This is why one must
        do a flush tables or use SHOW INDEX for the option to take effect.
      - Using fix_innodb_cardinality can thus affect all users in their query
        plans if they are using the same tables.
      
      Because of this, it is strongly recommended that one uses
      optimizer_adjust_secondary_key_costs=fix_innodb_cardinality mainly
      in configuration files to not cause issues for other users.
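      
      The arithmetic behind the doubling can be shown with a simplified
      model. It assumes that the displayed Cardinality is the row count
      divided by rec_per_key and that the halved value is clamped to at
      least 1; the function names and the clamping detail are illustrative
      assumptions, not taken from the InnoDB source.

```cpp
#include <cassert>
#include <cstdint>

// Simplified model: rows per key value as the optimizer would see it.
// Without the fix the true value is divided by 2 (assumed clamped to 1).
std::uint64_t rec_per_key(std::uint64_t true_rows_per_key, bool fix_enabled) {
  if (true_rows_per_key == 0)
    return 1;
  if (fix_enabled)
    return true_rows_per_key;
  std::uint64_t halved = true_rows_per_key / 2;
  return halved ? halved : 1;
}

// Assumed relationship between table rows and displayed Cardinality.
std::uint64_t cardinality(std::uint64_t rows, std::uint64_t rows_per_key) {
  return rows / rows_per_key;
}
```

      For a table of 1000 rows with 4 identical values per key, this model
      gives a Cardinality of 500 without the option and 250 with it.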
    • MDEV-34502 fixup: Do not cripple MSAN · 7e5c9ccd
      Marko Mäkelä authored
      We need to work around deficiencies of Valgrind, and apparently
      the previous work-around attempts
      (such as d247d649) do not work
      anymore, definitely not on recent clang-based compilers.
      
      MemorySanitizer should be fine; unfortunately we set HAVE_valgrind for it
      as well.
    • MDEV-34565: SIGILL due to OS not supporting AVX512 · 232d7a5e
      Marko Mäkelä authored
      It is not sufficient to check that the CPU supports the necessary
      instructions. Also the operating system (or virtual machine hypervisor)
      must enable all the AVX registers to be saved and restored on a
      context switch.
      
      Because clang 8 does not support the compiler intrinsic _xgetbv()
      we will require clang 9 or later for enabling the use of VPCLMULQDQ
      and the related AVX512 features.
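      
      The operating system side of the check can be sketched as follows.
      This is not the code from the commit: it reads XCR0 with GCC/clang
      inline assembly rather than the _xgetbv() intrinsic, the function
      names are hypothetical, and the separate CPUID feature test for
      VPCLMULQDQ and AVX512 is omitted. The mask covers the XCR0 bits for
      SSE, AVX, opmask, ZMM_Hi256 and Hi16_ZMM state.

```cpp
#include <cassert>
#include <cstdint>
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
# include <cpuid.h>
#endif

// XCR0 bits 1 (SSE), 2 (AVX), 5 (opmask), 6 (ZMM_Hi256), 7 (Hi16_ZMM).
constexpr std::uint64_t XCR0_AVX512_MASK = 0xE6;

// Pure check: does this XCR0 value say that the OS saves and restores
// all register state that AVX512 code needs?
constexpr bool xcr0_enables_avx512(std::uint64_t xcr0) {
  return (xcr0 & XCR0_AVX512_MASK) == XCR0_AVX512_MASK;
}

// Hypothetical helper: query the running system. Returns false on
// non-x86 targets or unsupported compilers.
bool os_enables_avx512() {
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
  unsigned eax, ebx, ecx, edx;
  if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
    return false;
  if (!(ecx & (1U << 27)))  // OSXSAVE: XGETBV is usable only if set
    return false;
  unsigned lo, hi;
  __asm__ volatile("xgetbv" : "=a"(lo), "=d"(hi) : "c"(0));
  return xcr0_enables_avx512((std::uint64_t(hi) << 32) | lo);
#else
  return false;
#endif
}
```

      Keeping the mask test separate from the register read makes it
      possible to exercise the logic with fixed XCR0 values on any machine.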
  10. 27 Jul, 2024 1 commit
  11. 25 Jul, 2024 1 commit
  12. 24 Jul, 2024 1 commit
  13. 23 Jul, 2024 2 commits
    • MDEV-34066 Output of SHOW ENGINE INNODB STATUS uses the nanoseconds suffix for microseconds · 3359ac09
      Thirunarayanan Balathandayuthapani authored
      - This issue is caused by commit e71e6133
      (MDEV-24671). Change the suffix of the transaction lock wait
      time in the output to microseconds.
    • MDEV-34634 Types mismatch when cloning items causes debug assertion · c91aeb37
      Oleg Smirnov authored
      New runtime diagnostic introduced with MDEV-34490 has detected
      that `Item_int_with_ref` incorrectly returns an instance of its ancestor
      class `Item_int`. This commit fixes that.
      
      In addition, this commit reverts a part of the diagnostic related
      to `clone_item()` checks. As it turned out, `clone_item()` is not required
      to return an object of the same class as the cloned one. For example,
      look at `Item_param::clone_item()`: it can return objects of `Item_null`,
      `Item_int`, `Item_string`, etc, depending on the object state.
      So the runtime type diagnostic is not applicable to `clone_item()` and
      is disabled with this commit.
      
      As similar diagnostic failures are expected to appear again
      in the future, this commit introduces a new test file in the main suite:
      item_types.test. New test cases may be added to this file.
      
      Reviewer: Oleksandr Byelkin <sanja@mariadb.com>
  14. 22 Jul, 2024 1 commit
  15. 20 Jul, 2024 1 commit
  16. 19 Jul, 2024 3 commits
    • MDEV-15393 gtid_slave_pos duplicate key errors after mysqldump restore · b8f92ade
      Andrei authored
      When mysqldump is run to dump the `mysql` system database, it generates
      INSERT statements into the table `mysql.gtid_slave_pos`.
      After running the backup script,
      those inserts did not produce the expected gtid state on the slave. In
      particular, the maximum of mysql.gtid_slave_pos.sub_id did not make
      it into
         rpl_global_gtid_slave_state.last_sub_id
      
      an in-memory object that is supposed to match the current state of the
      table. And that was regardless of whether the --gtid option was specified
      or not. Later, when the backup recipient server starts as a slave
      in *non-gtid* mode, this desynchronization may lead to a duplicate key
      error.
      
      This effect is corrected for --gtid mode mysqldump/mariadb-dump only,
      as follows.  The fixes ensure the insert block of the dump
      script is followed by a "summing-up" SET @@global.gtid_slave_pos
      assignment.
      
      For the implementation part, note that a deferred print-out of
      SET-gtid_slave_pos and associated comments is preferred over relocating
      the entire blocks if (opt_master,slave_data &&
      do_show_master,slave_status) ...  because of compatibility
      concerns. Namely, an error inside do_show_*() is handled in the new code
      the same way, and as early, as before.
      
      A regression test can be run in how-to-reproduce mode as well.
      One affected mtr test was observed: the
      rpl_mysqldump_slave.result "mismatch" now shows the new policy of
      deferred printing of SET-gtid_slave_pos in action.
    • New CC 3.3 · a94fd874
      Oleksandr Byelkin authored
    • Fix view protocol · b8b6cab2
      Oleksandr Byelkin authored
  17. 18 Jul, 2024 2 commits
  18. 17 Jul, 2024 11 commits