1. 12 Jul, 2023 3 commits
    • MDEV-31448: Killing a replica thread awaiting its GCO can hang/crash a parallel replica · a8ea6627
      Kristian Nielsen authored
      The problem was an incorrect unmark_start_commit() in
      signal_error_to_sql_driver_thread(). If an event group gets an error, this
      unmark could run after the following GCO started, and the subsequent
      re-marking could access a de-allocated GCO.
      
      The offending unmark_start_commit() looks obviously incorrect, and the fix
      is to just remove it. It was introduced in the MDEV-8302 patch, the commit
      message of which suggests it was added there solely to satisfy an assertion
      in ha_rollback_trans(). So update this assertion instead to not trigger for
      event groups that experienced an error (rgi->worker_error). When an error
      occurs in an event group, all following event groups are skipped anyway, so
      the unmark should never be needed in this case.
      Reviewed-by: Andrei Elkin <andrei.elkin@mariadb.com>
      Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
    • MDEV-13915: STOP SLAVE takes very long time on a busy system · 60bec1d5
      Kristian Nielsen authored
      At STOP SLAVE, worker threads will continue applying event groups until the
      end of the current GCO before stopping. This is a left-over from when only
      conservative mode was available. In optimistic and aggressive mode, often
      _all_ queued events will be in the same GCO, and the slave stop will be
      needlessly delayed.
      
      This patch instead records at STOP SLAVE time the latest (highest sub_id)
      event group that has started. Then worker threads will continue to apply
      event groups up to that event group, but skip any following. The result is
      that each worker thread will complete its currently running event group, and
      then the slave will stop.
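The cutoff logic can be sketched like this (illustrative names and a toy data layout, not the actual rpl_parallel code):

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of the STOP SLAVE cutoff: at stop time the coordinator records
 * the highest sub_id of any event group that has already started;
 * workers then apply groups up to that sub_id and skip everything
 * queued after it. */
static int groups_to_apply(const uint64_t *sub_ids, int n,
                           uint64_t stop_sub_id)
{
  int applied = 0;
  for (int i = 0; i < n; i++)
    if (sub_ids[i] <= stop_sub_id)   /* at or before the cutoff */
      applied++;
  return applied;                    /* later groups are skipped */
}
```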
      
      If the slave is caught up, and STOP SLAVE is run in the middle of an event
      group that is already executing in a worker thread, then that event group
      will be rolled back and the slave stops immediately, as normal.
      Reviewed-by: Andrei Elkin <andrei.elkin@mariadb.com>
      Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
  2. 11 Jul, 2023 1 commit
    • MDEV-27038 Custom configuration file procedure does not work with Docker Desktop for Windows 10+ · 23d53913
      Daniel Black authored
      Docker on Windows, when mounting a configuration file into a container,
      exposes the file with permission 0777. Such world-writable files are
      ignored by MariaDB.
      
      Add an access check so that a read-only filesystem or an immutable file
      counts as sufficient protection for the file.
      
      Test:
      $ mkdir /tmp/src
      $ vi /tmp/src/my.cnf
      $ chmod 666 /tmp/src/my.cnf
      $ mkdir /tmp/dst
      $ sudo mount --bind /tmp/src /tmp/dst -o ro
      $ ls -la /tmp/dst
      total 4
      drwxr-xr-x.  2 dan  dan   60 Jun 15 15:12 .
      drwxrwxrwt. 25 root root 660 Jun 15 15:13 ..
      -rw-rw-rw-.  1 dan  dan   10 Jun 15 15:12 my.cnf
      $ mount | grep dst
      tmpfs on /tmp/dst type tmpfs (ro,seclabel,nr_inodes=1048576,inode64)
      
      strace client/mariadb --defaults-file=/tmp/dst/my.cnf
      
      newfstatat(AT_FDCWD, "/tmp/dst/my.cnf", {st_mode=S_IFREG|0666, st_size=10, ...}, 0) = 0
      access("/tmp/dst/my.cnf", W_OK)         = -1 EROFS (Read-only file system)
      openat(AT_FDCWD, "/tmp/dst/my.cnf", O_RDONLY|O_CLOEXEC) = 3
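The added protection test could look roughly like the following sketch (the function name is hypothetical, not the actual MariaDB defaults-file code):

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical sketch of the idea behind the fix: a config file whose
 * mode bits look world-writable is still acceptable when the filesystem
 * refuses writes anyway (read-only mount), because then nobody can
 * actually modify it.  access() checks effective permissions, and EROFS
 * tells us the filesystem itself is read-only. */
static int file_is_write_protected(const char *path)
{
  if (access(path, W_OK) != 0 && errno == EROFS)
    return 1;   /* read-only filesystem: sufficient protection */
  return 0;     /* writable, or denied for some other reason */
}
```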
      
      One test still fails, but this isn't a regression, just not a total fix:
      
      $ chmod u-w /tmp/src/my.cnf
      $ ls -la /tmp/src/my.cnf
      -r--rw-rw-. 1 dan dan 18 Jun 16 10:22 /tmp/src/my.cnf
      $ strace -fe trace=access client/mariadb --defaults-file=/tmp/dst/my.cnf
      access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
      access("/etc/system-fips", F_OK)        = -1 ENOENT (No such file or directory)
      access("/tmp/dst/my.cnf", W_OK)         = -1 EACCES (Permission denied)
      Warning: World-writable config file '/tmp/dst/my.cnf' is ignored
      
      Windows test (Docker Desktop ~4.21) which was the important one to fix:
      
      dan@LAPTOP-5B5P7RCK:~$ docker run --rm -v /mnt/c/Users/danie/Desktop/conf:/etc/mysql/conf.d/:ro -e MARIADB_ROOT_PASSWORD=bob quay.io/mariadb-foundation/mariadb-devel:10.4-MDEV-27038-ro-mounts-pkgtest ls -la /etc/mysql/conf.d
      total 4
      drwxrwxrwx 1 root root  512 Jun 15 13:57 .
      drwxr-xr-x 4 root root 4096 Jun 15 07:32 ..
      -rwxrwxrwx 1 root root   43 Jun 15 13:56 myapp.cnf
      
      root@a59b38b45af1:/# strace -fe trace=access mariadb
      access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
      access("/etc/mysql/conf.d/myapp.cnf", W_OK) = -1 EROFS (Read-only file system)
  3. 10 Jul, 2023 2 commits
    • MDEV-20010 Equal on two RANK window functions create wrong result · 7a5c984f
      Monty authored
      The problematic query exposed a bug in the window functions sorting
      optimization. When multiple window functions are present in a query,
      we order the sorting keys (as defined by PARTITION BY and ORDER BY)
      from generic to specific.
      
      SELECT RANK() OVER (ORDER BY const_col) as r1,
             RANK() OVER (ORDER BY const_col, a) as r2,
             RANK() OVER (PARTITION BY c) as r3,
             RANK() OVER (PARTITION BY c ORDER BY b) as r4
      FROM table;
      
      For these functions, the sorts we need to do for window function
      computations are: [(const_col), (const_col, a)] and [(c), (c, b)].
      
      Instead of doing 4 different sorts, the sorts grouped within [] are
      compatible, and we can use the most *specific* sort to cover both window
      functions.
      
      The bug was caused by incorrect flagging of which sort is the most
      specific for a compatible group of functions. In our specific test case,
      instead of picking (const_col, a) as the most specific sort, it would
      only sort by (const_col), which led to wrong results for the rank
      function. By ensuring that we pick the last sort key before an
      "incompatible sort" flag is met in our "ordered array of sorting
      specifications", we guarantee correct results.
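As a toy illustration of the "most specific compatible sort" idea (this is not the server's data structure; sort specs are modeled as comma-separated key lists):

```c
#include <assert.h>
#include <string.h>

/* Toy model: two sort specifications are compatible when the shorter
 * comma-separated key list is a whole-token prefix of the longer one;
 * the group should then be sorted by the most specific (longest) spec,
 * since that single sort covers both window functions. */
static const char *most_specific(const char *a, const char *b)
{
  size_t la = strlen(a), lb = strlen(b);
  const char *s = la <= lb ? a : b;   /* shorter spec */
  const char *l = la <= lb ? b : a;   /* longer spec  */
  size_t ls = la <= lb ? la : lb;
  if (strncmp(l, s, ls) == 0 && (l[ls] == '\0' || l[ls] == ','))
    return l;      /* compatible: one sort covers both */
  return NULL;     /* incompatible: separate sorts needed */
}
```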
    • MDEV-31641 innochecksum dies with Floating point exception · 12a5fb4b
      Marko Mäkelä authored
      print_summary(): Skip index_ids for which index.pages is 0.
      Tablespaces may contain some freed pages that used to refer to indexes
      or tables that were dropped.
  4. 07 Jul, 2023 3 commits
    • MDEV-31064 Changes in a SP are not immediately seen in I_S.parameters · 02cd3675
      Lawrin Novitsky authored
      If a procedure is changed in one connection while another connection has
      already called the initial version of that procedure, a query to
      INFORMATION_SCHEMA.PARAMETERS would use obsolete information from the sp
      cache of that connection. That happens because the cache invalidation
      method only increments the cache version and does not flush (all) the
      cache(s), while changing a procedure only invalidates the cache and
      removes the procedure's cache entry from the local thread cache.
      
      The fix adds a check of whether the sp info obtained from the cache for
      forming the results of the I_S query is obsolete, and does not use it if
      it is.
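The idea of the check can be sketched as follows (hypothetical names; the real code compares a cached routine's version against the global cache version):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch: each cached routine entry remembers the cache
 * version it was loaded under.  Invalidation only bumps the global
 * version, so an entry loaded under an older version must be treated
 * as stale instead of being used to answer an I_S query. */
static int sp_entry_is_stale(uint64_t entry_version, uint64_t current_version)
{
  return entry_version < current_version;
}
```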
      
      A test has been added to main.information_schema. It changes the SP in
      one connection and ensures that the change is seen by a query to
      I_S.PARAMETERS in another connection that had already called the
      procedure before the change.
    • MDEV-24712 get_partition_set is never executed in... · 8fb863e6
      Yury Chaikou authored
      MDEV-24712 get_partition_set is never executed in ha_partition::multi_range_key_create_key due to bitwise & with 0 constant
      
      use == to compare enum (despite the name it is not a bit flag)
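The bug pattern is easy to reproduce in miniature (the enum name and values here are illustrative, not Spider's):

```c
#include <assert.h>

/* Illustrative enum: the first enumerator has value 0, so testing it
 * with bitwise & always yields 0 and the guarded code never runs;
 * comparing with == is the correct check for a plain (non-flag) enum. */
enum range_kind { RANGE_MULTI = 0, RANGE_SINGLE = 1 };

static int matches_bitwise(enum range_kind t)
{
  return (t & RANGE_MULTI) != 0;   /* always false: & with constant 0 */
}

static int matches_equal(enum range_kind t)
{
  return t == RANGE_MULTI;         /* behaves as intended */
}
```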
    • MDEV-29284 ANALYZE doesn't work with pushed derived tables · 94a8921e
      Oleg Smirnov authored
      There was no actual execution of the SQL of a pushed derived table,
      which caused "r_rows" to always be displayed as 0 and "r_total_time_ms"
      to show inaccurate numbers.
      This commit makes the derived table's SQL be executed by the storage
      engine, so the server is able to calculate the number of rows returned
      and measure the execution time more accurately.
  5. 06 Jul, 2023 2 commits
    • MDEV-10962 Deadlock with 3 concurrent DELETEs by unique key · 1bfd3cc4
      Vlad Lesin authored
      PROBLEM:
      A deadlock was possible when a transaction tried to "upgrade" an already
      held Record Lock to Next Key Lock.
      
      SOLUTION:
      This patch is based on observations that:
      (1) a Next Key Lock is equivalent to Record Lock combined with Gap Lock
      (2) a GAP Lock never has to wait for any other lock
      In case we request a Next Key Lock, we check if we already own a Record
      Lock of equal or stronger mode, and if so, then we change the requested
      lock type to GAP Lock, which we either already have, or can be granted
      immediately, as GAP locks don't conflict with any other lock types.
      (We don't consider Insert Intention Locks a Gap Lock in the above statements.)
      
      The reason why we don't upgrade a Record Lock to a Next Key Lock is the
      following.
      
      Imagine a transaction which does something like this:
      
      for each row {
          request lock in LOCK_X|LOCK_REC_NOT_GAP mode
          request lock in LOCK_S mode
      }
      
      Ideally, only two lock_t structs would be created for each page, one for
      LOCK_X|LOCK_REC_NOT_GAP mode and one for LOCK_S mode, with their bitmaps
      used to mark all records from the same page. But if we upgraded locks
      from Record Lock to Next Key Lock, each upgrade would change the mode of
      an existing struct, forcing later requests to create new ones.
      
      The situation would look like this:
      
      request lock in LOCK_X|LOCK_REC_NOT_GAP mode on row 1:
      // -> creates new lock_t for LOCK_X|LOCK_REC_NOT_GAP mode and sets bit for
      // 1
      request lock in LOCK_S mode on row 1:
      // -> notices that we already have LOCK_X|LOCK_REC_NOT_GAP on the row 1,
      // so it upgrades it to X
      request lock in LOCK_X|LOCK_REC_NOT_GAP mode on row 2:
      // -> creates a new lock_t for LOCK_X|LOCK_REC_NOT_GAP mode (because we
      // don't have any after we've upgraded!) and sets bit for 2
      request lock in LOCK_S mode on row 2:
      // -> notices that we already have LOCK_X|LOCK_REC_NOT_GAP on the row 2,
      // so it upgrades it to X
          ...etc...etc..
      
      Each iteration of the loop creates a new lock_t struct, and in the end we
      have a lot (one for each record!) of LOCK_X locks, each with single bit
      set in the bitmap. Soon we run out of space for lock_t structs.
      
      If we create LOCK_GAP instead of lock upgrading, the above scenario works
      like the following:
      
      request lock in LOCK_X|LOCK_REC_NOT_GAP mode on row 1:
      // -> creates new lock_t for LOCK_X|LOCK_REC_NOT_GAP mode and sets bit for
      // 1
      request lock in LOCK_S mode on row 1:
      // -> notices that we already have LOCK_X|LOCK_REC_NOT_GAP on the row 1,
      // so it creates LOCK_S|LOCK_GAP only and sets bit for 1
      request lock in LOCK_X|LOCK_REC_NOT_GAP mode on row 2:
      // -> reuses the lock_t for LOCK_X|LOCK_REC_NOT_GAP by setting bit for 2
      request lock in LOCK_S mode on row 2:
      // -> notices that we already have LOCK_X|LOCK_REC_NOT_GAP on the row 2,
      // so it reuses LOCK_S|LOCK_GAP setting bit for 2
      
      In the end we have just two locks per page, one for each mode:
      LOCK_X|LOCK_REC_NOT_GAP and LOCK_S|LOCK_GAP.
      Another benefit of this solution is that it avoids the not entirely
      const-correct (and otherwise risky-looking) "upgrading".
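The struct-count difference can be demonstrated with a toy model (a simplification for illustration, not InnoDB's lock_sys):

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of per-page lock_t structs: one struct per lock mode, with
 * a bitmap of the rows it covers.  A request reuses an existing struct
 * of the same mode when one exists, as in the LOCK_GAP scenario above. */
enum { LOCK_X_REC_NOT_GAP, LOCK_S_GAP, MAX_LOCKS = 64 };

struct lock { int mode; uint64_t bitmap; };
static struct lock locks[MAX_LOCKS];
static int n_locks;

static void request(int mode, int row)
{
  for (int i = 0; i < n_locks; i++)
    if (locks[i].mode == mode) {       /* reuse: just set the row's bit */
      locks[i].bitmap |= 1ULL << row;
      return;
    }
  locks[n_locks].mode = mode;          /* no match: allocate a new struct */
  locks[n_locks].bitmap = 1ULL << row;
  n_locks++;
}

/* Run the commit message's loop for `rows` rows and report how many
 * lock_t structs exist afterwards: two (one per mode), regardless of
 * the number of rows. */
static int run_scenario(int rows)
{
  n_locks = 0;
  for (int r = 0; r < rows; r++) {
    request(LOCK_X_REC_NOT_GAP, r);
    request(LOCK_S_GAP, r);
  }
  return n_locks;
}
```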
      
      The fix was ported from
      mysql/mysql-server@bfba840dfa7794b988c59c94658920dbe556075d
      mysql/mysql-server@75cefdb1f73b8f8ac8e22b10dfb5073adbdfdfb0
      
      Reviewed by: Marko Mäkelä
    • A cleanup for MDEV-30932 UBSAN: negation of -X cannot be represented in type .. · 19cdddf1
      Alexander Barkov authored
      "mtr --view-protocol func_math" failed because of a too long
      column names imlicitly generated for the underlying expressions.
      
      With --view-protocol they were replaced to "Name_exp_1".
      
      Adding column aliases for these expressions.
  6. 05 Jul, 2023 1 commit
  7. 04 Jul, 2023 2 commits
  8. 03 Jul, 2023 7 commits
  9. 29 Jun, 2023 3 commits
    • MDEV-30932 UBSAN: negation of -X cannot be represented in type .. · 67657a01
      Alexander Barkov authored
        'long long int'; cast to an unsigned type to negate this value ..
        to itself in Item_func_mul::int_op and Item_func_round::int_op
      
      Problems:
      
        The code in multiple places in the following methods:
          - Item_func_mul::int_op()
          - longlong Item_func_int_div::val_int()
          - Item_func_mod::int_op()
          - Item_func_round::int_op()
      
        did not properly check for the corner values LONGLONG_MIN
        and (LONGLONG_MAX+1) before doing negation.
        This caused UBSAN to complain about undefined behaviour.
      
      Fix summary:
      
        - Adding helper classes ULonglong, ULonglong_null, ULonglong_hybrid
          (in addition to their signed counterparts in sql/sql_type_int.h).
      
        - Moving the code performing multiplication of ulonglong numbers
          from Item_func_mul::int_op() to ULonglong_hybrid::ullmul().
      
        - Moving the code responsible for extracting absolute values
          from negative numbers to Longlong::abs().
          It makes sure to perform negation without undefined behavior:
          LONGLONG_MIN is handled in a special way.
      
        - Moving negation related code to ULonglong::operator-().
          It makes sure to perform negation without undefined behavior:
          (LONGLONG_MAX + 1) is handled in a special way.
      
        - Moving signed<=>unsigned conversion code to
          Longlong_hybrid::val_int() and ULonglong_hybrid::val_int().
      
        - Reusing old and new sql_type_int.h classes in multiple
          places in Item_func_xxx::int_op().
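The core trick behind Longlong::abs() and ULonglong::operator-() can be sketched in C (the function names here are sketches of the C++ helpers, not the actual methods):

```c
#include <assert.h>
#include <stdint.h>

/* Negate in the unsigned domain: unsigned wrap-around is well defined,
 * so 0 - v is safe even when v is 2^63 (the magnitude of LONGLONG_MIN,
 * which has no positive counterpart in a signed 64-bit type). */
static uint64_t ull_neg(uint64_t v)
{
  return (uint64_t)0 - v;
}

/* Absolute value of a signed 64-bit integer, returned as unsigned.
 * Writing plain -v would be undefined behavior for INT64_MIN; casting
 * to unsigned first keeps every step well defined. */
static uint64_t ull_abs(int64_t v)
{
  return v < 0 ? ull_neg((uint64_t)v) : (uint64_t)v;
}
```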
      
      Fix details (explain how sql_type_int.h classes are reused):
      
        - Instead of straight negation of negative "longlong" arguments
          *before* performing unsigned multiplication,
          Item_func_mul::int_op() now calls ULonglong_null::ullmul()
          using Longlong_hybrid_null::abs() to pass arguments.
          This fixes undefined behavior N1.
      
        - Instead of straight negation of "ulonglong" result
          *after* performing unsigned multiplication,
          Item_func_mul::int_op() now calls ULonglong_hybrid::val_int(),
          which recursively calls ULonglong::operator-().
          This fixes undefined behavior N2.
      
        - Removing duplicate negating code from Item_func_mod::int_op().
          Using ULonglong_hybrid::val_int() instead.
          This fixes undefined behavior N3.
      
        - Removing literal "longlong" negation from Item_func_round::int_op().
          Using Longlong::abs() instead, which correctly handles LONGLONG_MIN.
          This fixes undefined behavior N4.
      
        - Removing the duplicate (negation related) code from
          Item_func_int_div::val_int(). Reusing class ULonglong_hybrid.
          There was no undefined behavior here.
          However, this change revealed a bug in
          "-9223372036854775808 DIV 1".
          The removed negation code appeared to be incorrect when
          negating +9223372036854775808. It returned the "out of range" error.
          ULonglong_hybrid::operator-() now handles all values correctly
          and returns +9223372036854775808 as the negation of -9223372036854775808.
      
          Re-recording wrong results for
            SELECT -9223372036854775808 DIV  1;
          Now instead of "out of range", it returns -9223372036854775808,
          which is the smallest possible value for the expression data type
          (signed) BIGINT.
      
        - Removing "no UBSAN" branch from Item_func_splus::int_opt()
          and Item_func_minus::int_opt(), as it made UBSAN happy but
          in RelWithDebInfo some MTR tests started to fail.
  10. 28 Jun, 2023 1 commit
  11. 27 Jun, 2023 3 commits
    • mtr: fix the help text for debuggers · d214628a
      Sergei Golubchik authored
    • MDEV-31086 MODIFY COLUMN can break FK constraints, and lead to unrestorable dumps · 5f09b53b
      Thirunarayanan Balathandayuthapani authored
      - When foreign_key_checks is disabled, allowing modification of a
      column that is part of a foreign key constraint can lead to later
      refusal of TRUNCATE TABLE and OPTIMIZE TABLE. So it makes sense to
      block the column modify operation when a foreign key is involved,
      irrespective of the foreign_key_checks variable.
      
      The correct way to modify the charset of a column when an fk is involved:
      
      SET foreign_key_checks=OFF;
      ALTER TABLE child DROP FOREIGN KEY fk, MODIFY m VARCHAR(200) CHARSET utf8mb4;
      ALTER TABLE parent MODIFY m VARCHAR(200) CHARSET utf8mb4;
      ALTER TABLE child ADD CONSTRAINT FOREIGN KEY (m) REFERENCES PARENT(m);
      SET foreign_key_checks=ON;
      
      fk_check_column_changes(): Remove the FOREIGN_KEY_CHECKS check while
      checking a column change against foreign key constraints. This
      is a partial revert of commit 5f1f2fc0
      and changes the behaviour of the copy alter algorithm.
      
      ha_innobase::prepare_inplace_alter_table(): Find the modified
      column and check whether it is part of existing and newly
      added foreign key constraint.
    • MDEV-29447 MDEV-26285 MDEV-31338 Refactor spider_db_mbase_util::open_item_func · 423c28f0
      Yuchen Pei authored
      spider_db_mbase_util::open_item_func() is a monster function.
      It is difficult to maintain, yet we are expected to modify it
      whenever a new SQL function or a new func_type is added.
      
      We split the function into two distinct functions: one handles the
      case of str != NULL and the other handles the case of str == NULL.
      
      This refactoring was done in a conservative way because we do not
      have comprehensive tests on the function.
      
      It also fixes MDEV-29447 and MDEV-31338, where field items that are
      arguments of a func item may be used before being created/initialised.
      
      Note this commit is adapted from a patch by Nayuta for MDEV-26285.
  12. 26 Jun, 2023 1 commit
  13. 22 Jun, 2023 2 commits
  14. 19 Jun, 2023 1 commit
  15. 08 Jun, 2023 2 commits
  16. 07 Jun, 2023 3 commits
  17. 06 Jun, 2023 2 commits
  18. 05 Jun, 2023 1 commit
    • MDEV-13915: STOP SLAVE takes very long time on a busy system · 0a99d457
      Brandon Nesterenko authored
      The problem is that a parallel replica would not immediately stop
      running/queued transactions when issued STOP SLAVE. That is, it
      allowed the current group of transactions to run, and if the last
      group had started committing, transactions belonging to the next
      group could also be started and run through commit after STOP SLAVE
      was issued. This could lead to long waits for all pending
      transactions to finish.
      
      This patch updates the parallel replica to try to abort and roll
      back any ongoing transactions immediately. The exceptions are
      non-transactional event groups (e.g. those modifying sequences or
      non-transactional tables) and any prior transactions; these will be
      run to completion.
      
      The specifics are as follows:
      
       1. A new stage was added to SHOW PROCESSLIST output for the SQL
      Thread when it is waiting for a replica thread to either roll back or
      finish its transaction before stopping. This stage presents as
      “Waiting for worker thread to stop”.
      
       2. Worker threads which error or are killed no longer perform GCO
      cleanup if there is a concurrently running prior transaction. This
      is because a worker thread scheduled to run in a future GCO could be
      killed and incorrectly perform cleanup of the active GCO.
      
       3. Refined the cases in which the FL_TRANSACTIONAL flag is added to
      GTID binlog events, to disallow adding it to transactions which modify
      both transactional and non-transactional engines when the binlogging
      configuration allows the modifications to exist in the same event,
      i.e. when using binlog_direct_non_trans_update == 0 and
      binlog_format == statement.
      
       4. A few existing MTR tests relied on the completion of certain
      transactions after issuing STOP SLAVE, and were re-recorded
      (potentially with added synchronizations) under the new rollback
      behavior.
      
      Reviewed By
      ===========
      Andrei Elkin <andrei.elkin@mariadb.com>