1. 16 Sep, 2024 20 commits
    • Yuchen Pei's avatar
      MDEV-15696 Implement SHOW CREATE SERVER · 41c6b844
      Yuchen Pei authored
      One behaviour change: if the port is not supplied or is out of bounds,
      the old behaviour was to print 3306. The new behaviour is to omit the
      port (if not supplied) or to print the out-of-bounds value as given.
    • Yuchen Pei's avatar
      MDEV-34716 Allow arbitrary options in CREATE SERVER · 698e8fa3
      Yuchen Pei authored
      The existing syntax for CREATE SERVER is:
      
      CREATE [OR REPLACE] SERVER [IF NOT EXISTS] server_name
          FOREIGN DATA WRAPPER wrapper_name
          OPTIONS (option [, option] ...)
      
      option:
        { HOST character-literal
        | DATABASE character-literal
        | USER character-literal
        | PASSWORD character-literal
        | SOCKET character-literal
        | OWNER character-literal
        | PORT numeric-literal }
      
      With this change we have:
      
      option:
        { HOST character-literal
        | DATABASE character-literal
        | USER character-literal
        | PASSWORD character-literal
        | SOCKET character-literal
        | OWNER character-literal
        | PORT numeric-literal
        | PORT quoted-numerical-literal
        | identifier character-literal }
      
      We store these options as a JSON field in the mysql.servers system
      table. We retain the restriction that PORT needs to be a number, but
      also allow it to be a quoted number, so that SHOW CREATE SERVER can be
      used for dumping. Without an accompanying implementation of SHOW CREATE
      SERVER, some mysqldump tests will fail. Therefore this commit should
      be immediately followed by the one implementing SHOW CREATE SERVER,
      with tests covering both.
    • Yuchen Pei's avatar
      MDEV-34716 Fix mysql.servers socket max length too short · 4feb58e8
      Yuchen Pei authored
      The limit of socket length on unix according to libc is 108, see
      sockaddr_un::sun_path, but in the table it is a string of max length
      64, which results in truncation of socket and failure to connect by
      plugins using servers such as spider.
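The size mismatch described in the commit above can be checked against the platform headers. This is a minimal sketch, assuming a POSIX system that provides `<sys/un.h>`. The names `kOldServersSocketLen` and `fits_in_column` are hypothetical and exist only for this illustration; the value 64 is the old column width quoted in the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <sys/un.h>

// Old width of the socket column in mysql.servers, per the commit message.
constexpr std::size_t kOldServersSocketLen = 64;

// Capacity of sockaddr_un::sun_path on this platform (108 on Linux/glibc),
// including the terminating NUL byte.
constexpr std::size_t kSunPathLen = sizeof(sockaddr_un::sun_path);

// The OS accepts longer paths than the old column could store.
static_assert(kSunPathLen > kOldServersSocketLen,
              "sun_path is expected to be wider than the old column");

// Hypothetical helper: true if a socket path can be stored in a column of
// `column_len` characters without truncation.
bool fits_in_column(const std::string &path, std::size_t column_len)
{
  return path.size() <= column_len;
}
```

A path of, say, 100 characters is valid for the operating system here but would have been truncated by the 64-character column.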
    • Yuchen Pei's avatar
      MDEV-34716 spider: some trivial cleanups and documentation · b200cf78
      Yuchen Pei authored
      - document tmp_share, which are temporary spider shares with only one
      link (no ha)
      - simplify spider_get_sys_tables_connect_info() where link_idx is
      always 0
    • Alexey Botchkov's avatar
      4af516c4
    • Christian Gonzalez's avatar
      Make SESSION_USER() comparable with CURRENT_USER() · c88cb0b7
      Christian Gonzalez authored
      Update `SESSION_USER()` behaviour to be comparable with `CURRENT_USER()`.
      `SESSION_USER()` will return the user and host columns from `mysql.user`
      used to authenticate the user when the session was created.
      
      Historically `SESSION_USER()` was an alias of `USER()` function. The
      main difference from `USER()` behaviour after this change is that
      `SESSION_USER()` now returns the host column from `mysql.user` instead of
      the client host or ip.
      
      NOTE: `SESSION_USER_IS_USER` old mode is added to make the change
      backward compatible.
      
      All new code of the whole pull request, including one or several files
      that are either new files or modified ones, are contributed under the
      BSD-new license. I am contributing on behalf of my employer
      Amazon Web Services, Inc.
    • Aleksey Midenkov's avatar
      MDEV-27293 Allow converting a versioned table from implicit · 4870d740
      Aleksey Midenkov authored
      to explicit row_start/row_end columns
      
      When both system fields being added have the same type (length, unsigned
      flag) as the old implicit system fields, rename the implicit system
      fields to the names specified in ALTER and remove their SYSTEM_INVISIBLE
      flag. A correct PERIOD clause must be specified in ALTER as well.
      
      MDEV-34904 Inplace alter for implicit to explicit versioning is broken
      
      Whether ALTER goes inplace and how it goes inplace depends on
      handler_flags, which is derived from alter_info->flags by this logic:
      
        ha_alter_info->handler_flags|= (alter_info->flags & ~flags_to_remove);
      
      ALTER_VERS_EXPLICIT was not in flags_to_remove and its value (1ULL <<
      35) clashed with ALTER_ADD_NON_UNIQUE_NON_PRIM_INDEX.
      
      ALTER_VERS_EXPLICIT must not affect inplace ALTER; it is SQL-only, so
      we remove it from handler_flags.
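The flag clash in the commit above can be reproduced in isolation. This is a sketch under stated assumptions: `SQL_ONLY_FLAG`, `HANDLER_INDEX_FLAG`, `OTHER_FLAG` and `to_handler_flags` are hypothetical stand-ins, and only the shared value `1ULL << 35` and the masking expression come from the commit message.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for ALTER_VERS_EXPLICIT (an SQL-layer flag).
constexpr std::uint64_t SQL_ONLY_FLAG = 1ULL << 35;
// Stand-in for ALTER_ADD_NON_UNIQUE_NON_PRIM_INDEX (a handler-layer flag)
// that happens to use the same bit.
constexpr std::uint64_t HANDLER_INDEX_FLAG = 1ULL << 35;
// Some unrelated flag that should always pass through (value is arbitrary).
constexpr std::uint64_t OTHER_FLAG = 1ULL << 3;

// Mirrors the masking step quoted in the commit message:
//   handler_flags |= (alter_info->flags & ~flags_to_remove)
std::uint64_t to_handler_flags(std::uint64_t alter_flags,
                               std::uint64_t flags_to_remove)
{
  return alter_flags & ~flags_to_remove;
}
```

If the SQL-only flag is not listed in `flags_to_remove`, the handler layer sees bit 35 set and interprets it as an index addition. Adding it to the mask leaves only the flags the handler is meant to see.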
    • Oleg Smirnov's avatar
      MDEV-27277 Add a warning when max_sort_length is reached · 68ad4090
      Oleg Smirnov authored
      During query execution, sorting and grouping operations on strings
      may be involved. The system variable max_sort_length defines
      the maximum number of bytes to use when comparing strings during
      sorting/grouping. Thus, the compared parts of strings may be shorter
      than their actual size, so the results of the query may not be
      sorted/grouped properly.
      To indicate that some comparisons were done on truncated values,
      a new warning has been introduced with this commit.
    • Alexander Barkov's avatar
      MDEV-27277 Add a warning when max_sort_length is reached · 882aa0a2
      Alexander Barkov authored
      Step#1: fixing the return type of strnxfrm() from size_t to this structure:
      
      typedef struct
      {
        size_t m_output_length;
        size_t m_source_length_used;
        uint m_warnings;
      } my_strnxfrm_ret_t;
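To show why a structured return value helps, here is a toy transform that reports truncation through the third member. It is a sketch only: `toy_strnxfrm` and `TOY_WARN_SOURCE_TRUNCATED` are hypothetical names, the byte copy stands in for a real collation transform, and `uint` is spelled `unsigned int` to keep the block self-contained.

```cpp
#include <cassert>
#include <stddef.h>
#include <string.h>

typedef struct
{
  size_t m_output_length;       // bytes written to the destination
  size_t m_source_length_used;  // bytes consumed from the source
  unsigned int m_warnings;      // 'uint' in the commit message
} my_strnxfrm_ret_t;

// Hypothetical warning bit: the source did not fit into the destination.
const unsigned int TOY_WARN_SOURCE_TRUNCATED = 1;

// Toy "transform": copies bytes unchanged. A real implementation would emit
// collation weights; only the reporting pattern is illustrated here.
my_strnxfrm_ret_t toy_strnxfrm(unsigned char *dst, size_t dstlen,
                               const unsigned char *src, size_t srclen)
{
  assert(dst != NULL && src != NULL);
  size_t n = srclen < dstlen ? srclen : dstlen;
  memcpy(dst, src, n);
  my_strnxfrm_ret_t ret;
  ret.m_output_length = n;
  ret.m_source_length_used = n;
  ret.m_warnings = n < srclen ? TOY_WARN_SOURCE_TRUNCATED : 0;
  return ret;
}
```

With a plain `size_t` return the caller could not tell a short string from a truncated one; with the struct it can compare `m_source_length_used` against the source length or inspect `m_warnings`.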
    • Yuchen Pei's avatar
      MDEV-25008: UPDATE/DELETE: Cost-based choice IN->EXISTS vs Materialization · dee4a4a6
      Yuchen Pei authored
      Single-table UPDATE/DELETE didn't provide an outer_lookup_keys value for
      subqueries. This prevented a meaningful choice between the
      IN->EXISTS and Materialization strategies for subqueries.
      
      Fix this:
      * Make UPDATE/DELETE save Sql_cmd_dml::scanned_rows,
      * Then, subquery's JOIN::choose_subquery_plan() can fetch it from
      there for outer_lookup_keys
      
      Details:
      UPDATE/DELETE now calls select_lex->optimize_unflattened_subqueries()
      twice, like SELECT does (first call optimize_constant_subqueries() in
      JOIN::optimize_inner(), then call optimize_unflattened_subqueries() in
      JOIN::optimize_stage2()):
      1. Call it with const_only=true before any optimizations. This allows
      the range optimizer and others to use the values of cheap const
      subqueries.
      2. Call it with const_only=false after the range optimizer, partition
      pruning, etc. The outer_lookup_keys value is provided, so it's possible
      to pick a good subquery strategy.
      
      Note: PROTECT_STATEMENT_MEMROOT requires that the first SP execution
      performs subquery optimization for all subqueries, even for degenerate
      query plans like "Impossible WHERE". For that reason, we ensure that
      the call to optimize_unflattened_subqueries (with const_only=false)
      still happens even for degenerate query plans, as was the case before
      this change.
    • Yuchen Pei's avatar
    • Alexander Barkov's avatar
      MDEV-15751 CURRENT_TIMESTAMP should return a TIMESTAMP [WITH TIME ZONE?] · dad24a55
      Alexander Barkov authored
      Changing the return type of the following functions:
        - CURRENT_TIMESTAMP, CURRENT_TIMESTAMP(), NOW()
        - SYSDATE()
        - FROM_UNIXTIME()
      from DATETIME to TIMESTAMP.
      
      Note, the old function NOW() returning DATETIME is still available
      as LOCALTIMESTAMP or LOCALTIMESTAMP(), e.g.:
      
        SELECT
          LOCALTIMESTAMP,     -- DATETIME
          CURRENT_TIMESTAMP;  -- TIMESTAMP
      
      The change in the functions' return data type fixes some problems
      that occurred near a DST change:
      
      - Problem #1
      
      INSERT INTO t1 (timestamp_field) VALUES (CURRENT_TIMESTAMP);
      INSERT INTO t1 (timestamp_field) VALUES (COALESCE(CURRENT_TIMESTAMP));
      
      could result in two different values being inserted.
      
      - Problem #2
      
      INSERT INTO t1 (timestamp_field) VALUES (FROM_UNIXTIME(1288477526));
      INSERT INTO t1 (timestamp_field) VALUES (FROM_UNIXTIME(1288477526+3600));
      
      could result in two equal TIMESTAMP values near a DST change.
      
      Additional changes:
      
      - FROM_UNIXTIME(0) now returns SQL NULL instead of '1970-01-01 00:00:00'
        (assuming time_zone='+00:00')
      
      - UNIX_TIMESTAMP('1970-01-01 00:00:00') now returns SQL NULL instead of 0
        (assuming time_zone='+00:00')
      
      These additional changes are needed for consistency with TIMESTAMP fields,
      which cannot store '1970-01-01 00:00:00 +00:00'.
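Problem #2 in the commit above comes down to simple arithmetic: a zone-less DATETIME can only carry wall-clock time, and two instants one hour apart map to the same wall-clock value when clocks go back. The sketch below models this with a hypothetical time zone whose offset drops from UTC+4 to UTC+3 at an assumed transition instant. The constants and `to_wall_clock` are illustrative, not server code.

```cpp
#include <cassert>
#include <cstdint>

// Assumed transition instant (seconds since the Unix epoch) at which the
// hypothetical zone switches from UTC+4 to UTC+3.
constexpr std::int64_t kTransitionUtc = 1288479600;
constexpr std::int64_t kOffsetBefore = 4 * 3600;
constexpr std::int64_t kOffsetAfter = 3 * 3600;

// Wall-clock seconds for a given instant: all that a DATETIME without zone
// information can represent.
std::int64_t to_wall_clock(std::int64_t unix_time)
{
  return unix_time +
         (unix_time < kTransitionUtc ? kOffsetBefore : kOffsetAfter);
}
```

Under these assumptions, `1288477526` and `1288477526 + 3600` are different instants with the same wall-clock reading. Converting through DATETIME loses the distinction, whereas a TIMESTAMP result keeps the instants apart.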
    • Alexander Barkov's avatar
      MDEV-12252 ROW data type for stored function return values · 3624fb78
      Alexander Barkov authored
      Adding support for the ROW data type in the stored function RETURNS clause:
      
      - explicit ROW(..members...) for both sql_mode=DEFAULT and sql_mode=ORACLE
      
        CREATE FUNCTION f1() RETURNS ROW(a INT, b VARCHAR(32)) ...
      
      - anchored "ROW TYPE OF [db1.]table1" declarations for sql_mode=DEFAULT
      
        CREATE FUNCTION f1() RETURNS ROW TYPE OF test.t1 ...
      
      - anchored "[db1.]table1%ROWTYPE" declarations for sql_mode=ORACLE
      
        CREATE FUNCTION f1() RETURN test.t1%ROWTYPE ...
      
      Adding support for anchored scalar data types in RETURNS clause:
      
      - "TYPE OF [db1.]table1.column1" for sql_mode=DEFAULT
      
        CREATE FUNCTION f1() RETURNS TYPE OF test.t1.column1;
      
      - "[db1.]table1.column1%TYPE" for sql_mode=ORACLE
      
        CREATE FUNCTION f1() RETURN test.t1.column1%TYPE;
      
      Details:
      
      - Adding a new sql_mode_t parameter to
          sp_head::create()
          sp_head::sp_head()
          sp_package::create()
          sp_package::sp_package()
        to guarantee early initialization of sp_head::m_sql_mode.
        Before this change, this member was not initialized at all during
        CREATE FUNCTION/PROCEDURE/PACKAGE statements, and was not used.
        Now it needs to be initialized in order to properly write the
        mysql.proc.returns column, according to the create time sql_mode.
      
      - Code refactoring to make things simpler and functions smaller:
      
        * Adding a new method
          Field_row::row_create_fields(THD *thd, List<Spvar_definition> *list)
          to make a Virtual_tmp_table with Fields for ROW members
          from an explicit definition.
      
        * Adding a new method
          Field_row::row_create_fields(THD *thd, const Spvar_definition &def)
          to make a Virtual_tmp_table with Fields for ROW members
          from an explicit or a table anchored definition.
      
        * Adding a new method
          Item_args::add_array_of_item_field(THD *thd, const Virtual_tmp_table &vtable)
          to create an array of Item_field corresponding to all Field instances
          in a Virtual_tmp_table
      
        * Removing Item_field_row::row_create_items(). It was decomposed
          into the new methods described above.
      
        * Moving the code from the loop body in sp_rcontext::init_var_items()
          into a separate method Spvar_definition::make_item_field_row(),
          to make the code clearer (smaller functions).
          make_item_field_row() itself uses the new methods described above.
      
      - Changing the data type of sp_head::m_return_field_def
        from Column_definition to Spvar_definition.
        So now it supports not only SQL column field types,
        but also explicit ROW and anchored ROW data types,
        as well as anchored column types.
      
      - Adding a new Column_definition parameter to sp_head::create_result_field().
        Before this patch, create_result_field() took the definition only
        from m_return_field_def. Now it's also called with a local Column_definition
        variable which contains the explicit definition resolved from an
        anchored definition.
      
      - Modifying sql_yacc.yy to support the new grammar.
        Adding new helper methods:
          * sf_return_fill_definition_row()
          * sf_return_fill_definition_rowtype_of()
          * sf_return_fill_definition_type_of()
      
      - Fixing tests in:
        * Virtual_tmp_table::setup_field_pointers() in sql_select.cc
        * Send_field::normalize() in field.h
        * store_column_type()
        to prevent calling Type_handler_row::field_type(),
        which is implemented as a DBUG_ASSERT(0).
        Before this patch the affected methods and functions were called only
        for scalar data types. Now ROW is also possible.
      
      - Adding a new virtual method Field::cols()
      
      - Overriding methods:
         Item_func_sp::cols()
         Item_func_sp::element_index()
         Item_func_sp::check_cols()
         Item_func_sp::bring_value()
        to support the ROW data type.
      
      - Extending the rule sp_return_type to support
        * explicit ROW and anchored ROW data types
        * anchored scalar data types
      
      - Overriding Field_row::sql_type() to print
        the data type of an explicit ROW.
    • Sergei Golubchik's avatar
      post-merge changes · f5e4c461
      Sergei Golubchik authored
      * remove duplicate test file
      * move all uuidv7 tests into plugin/type_uuid/mysql-test/type_uuid/
      * remove mysys/ changes
      * auto my_random_bytes() fallback - removes duplicate code from uuid,
        and fixes all other users of my_random_bytes() that don't check
        the return value (because, perhaps, they don't need crypto-strong
        random bytes)
      * End of 11.6 -> 11.7 in tests
      * clarify the warning text
      * UUID_VERSION_MASK()/UUID_VARIANT_MASK() must not depend on the version
      * allow 4x more monotonic uuidv7 per millisecond - instead of stretching
        1000 microseconds over 12 bits, let's use extra 2 bits as a counter
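The "4x more monotonic uuidv7 per millisecond" item above can be read as follows: 1000 microsecond values fit in 10 bits, which leaves 2 of the 12 bits free for a counter. The sketch below is one possible packing under that reading. The bit order and the name `pack_sub_ms` are assumptions for illustration, not the layout used by the server.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical packing of a 12-bit sub-millisecond field:
// upper 10 bits hold the microsecond within the millisecond (0..999),
// lower 2 bits hold a counter (0..3) for values created in the same
// microsecond.
std::uint16_t pack_sub_ms(unsigned usec, unsigned counter)
{
  assert(usec < 1000 && counter < 4);
  return static_cast<std::uint16_t>((usec << 2) | counter);
}
```

With this packing the field stays below 4096, values remain ordered by (microsecond, counter), and each microsecond admits four distinct values instead of one.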
    • StefanoPetrilli's avatar
      89e0944d
    • Daniel Black's avatar
      MDEV-32583 UUID() should be treated as stochastic for the purposes of forcing query materialization · 8ea4590a
      Daniel Black authored
      Port 9e800eda changing lex->safe_to_cache_query
      to lex->uncacheable(UNCACHEABLE_RAND).
    • Alexander Barkov's avatar
      cleanup: MDEV-11339 Implement native UUID4 function · c5cc46f3
      Alexander Barkov authored
      - Moving the class UUIDv1 into a separate file sql_type_uuid_v1.h
      
      - Adding a new class UUIDv4, similar to UUIDv1
      
      - Changing how my_random_bytes() failures are handled.
        Instead of raising an error it now raises a note.
        Reasoning: if we're in the middle of a multi-million row
        transaction and one UUIDv4 generation fails, it's not a good
        idea to throw away the entire transaction. Instead, let's
        generate bytes using a my_rnd() loop.
      
      - Adding a new test func_uuid_v4.test to demonstrate that the UUIDv4()
        returned type is "UUID NOT NULL".
      
      - Adding a new test func_uuidv4_debug.test to emulate my_random_bytes()
        failures
      
      - Adding a template Item_func_uuid_vx to share the code
        between the implementations of UUID() and UUIDv4().
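The failure handling described in the commit above (fall back to a weaker generator and report a note rather than fail) follows a common pattern, sketched here with the C++ standard library. `strong_random_bytes`, `random_bytes_or_fallback` and the `simulate_failure` switch are hypothetical; `std::mt19937` merely plays the role of the my_rnd() loop.

```cpp
#include <cassert>
#include <cstddef>
#include <random>

// Hypothetical stand-in for a crypto-strong source that may fail.
// Returns false on failure; `simulate_failure` exists only for this sketch.
bool strong_random_bytes(unsigned char *buf, std::size_t len,
                         bool simulate_failure)
{
  if (simulate_failure)
    return false;
  std::random_device rd;
  for (std::size_t i = 0; i < len; i++)
    buf[i] = static_cast<unsigned char>(rd() & 0xFF);
  return true;
}

// Always fills `buf`. Returns true when the weaker fallback was used, so the
// caller can raise a note instead of aborting the statement.
bool random_bytes_or_fallback(unsigned char *buf, std::size_t len,
                              bool simulate_failure)
{
  assert(buf != nullptr || len == 0);
  if (strong_random_bytes(buf, len, simulate_failure))
    return false;
  std::mt19937 weak(20240916);  // not crypto-strong; arbitrary fixed seed
  for (std::size_t i = 0; i < len; i++)
    buf[i] = static_cast<unsigned char>(weak() & 0xFF);
  return true;
}
```

The caller never sees an error, which matches the reasoning in the commit message: one failed generation should not discard a long-running transaction.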
    • StefanoPetrilli's avatar
      2f288273
    • Sergei Golubchik's avatar
    • Monty's avatar
      MDEV-33144 Implement the Percona variable slow_query_log_always_write_time · 295c0ebf
      Monty authored
      This task is inspired by the Percona implementation of
      slow_query_log_always_write_time.
      
      This task implements the variable log_slow_always_query_time (name
      matching other MariaDB variables using the slow query log). The
      default value for the variable is 31536000, which makes MariaDB
      compatible with older installations.
      
      For queries with execution time longer than log_slow_always_query_time
      the variables log_slow_rate_limit and log_slow_min_examined_row_limit
      will be ignored and the query will be written to the slow query log
      if there are no other limitations (like log_slow_filter etc).
      
      Other things:
      - long_query_time internal variable renamed to log_slow_query_time.
      - More descriptive information for "log_slow_query_time".
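The decision described in the commit above can be sketched as a small predicate. This is a simplified model, not the server's logic: `SlowLogSettings` and `should_log` are hypothetical, the rate limit is modelled as "every Nth query", and other filters such as log_slow_filter are left out.

```cpp
#include <cassert>

// Settings named after the variables in the commit message.
struct SlowLogSettings
{
  double log_slow_query_time;
  double log_slow_always_query_time;
  unsigned long log_slow_rate_limit;
  unsigned long long log_slow_min_examined_row_limit;
};

// Returns true if the query would be written to the slow query log under
// this simplified model.
bool should_log(const SlowLogSettings &s, double query_time,
                unsigned long long rows_examined, unsigned long query_seq)
{
  if (query_time < s.log_slow_query_time)
    return false;  // not slow at all
  if (query_time > s.log_slow_always_query_time)
    return true;   // bypasses the rate limit and the examined-rows limit
  if (rows_examined < s.log_slow_min_examined_row_limit)
    return false;
  return s.log_slow_rate_limit <= 1 ||
         query_seq % s.log_slow_rate_limit == 0;
}
```

With the default of 31536000 seconds (one year) for log_slow_always_query_time, the bypass branch is effectively never taken, which is why the commit message describes the default as compatible with older installations.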
  2. 14 Sep, 2024 20 commits
    • Sergei Golubchik's avatar
      70ff3b66
    • Sergei Golubchik's avatar
      11.6: 32bit fixes · 3dfcefb6
      Sergei Golubchik authored
    • Sergey Vojtovich's avatar
      Disabled high-level indexes with Aria · 867f4562
      Sergey Vojtovich authored
      ... until a few bugs that cause server crashes are fixed.
    • Sergei Golubchik's avatar
      5118a5c3
    • Sergei Golubchik's avatar
      cleanup: TABLE_SHARE::lock_share() helper · 90a48a45
      Sergei Golubchik authored
      also: renames, s/const/constexpr/ for consistency
    • Sergey Vojtovich's avatar
      Simplified quick_rm_table() and mysql_rename_table() · bc954e75
      Sergey Vojtovich authored
      Replaced obscure FRM_ONLY, NO_FRM_RENAME, NO_HA_TABLE, NO_PAR_TABLE with
      straightforward explicit flags:
      
      QRMT_FRM - [re]moves .frm
      QRMT_PAR - [re]moves .par
      QRMT_HANDLER - calls ha_delete_table()/ha_rename_table() and [re]moves
                     high-level indexes
      QRMT_DEFAULT - same as QRMT_FRM | QRMT_HANDLER, which is regular table
                     drop/rename.
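The explicit flags listed above compose as ordinary bit flags. In the sketch below only the flag names and the composition of QRMT_DEFAULT come from the commit message; the bit values and the helper `planned_actions` are assumptions for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Bit values are assumed for this sketch.
enum : unsigned
{
  QRMT_FRM     = 1u << 0,
  QRMT_PAR     = 1u << 1,
  QRMT_HANDLER = 1u << 2,
  QRMT_DEFAULT = QRMT_FRM | QRMT_HANDLER
};

// Hypothetical helper: lists what a drop/rename call would do for the
// given flags, in the wording of the commit message.
std::vector<std::string> planned_actions(unsigned flags)
{
  std::vector<std::string> actions;
  if (flags & QRMT_FRM)
    actions.push_back("[re]move .frm");
  if (flags & QRMT_PAR)
    actions.push_back("[re]move .par");
  if (flags & QRMT_HANDLER)
    actions.push_back("call handler and [re]move high-level indexes");
  return actions;
}
```

Compared with negative flags such as NO_FRM_RENAME, a caller states exactly which parts to touch, and the regular drop/rename is the union of the .frm and handler parts.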
    • Sergey Vojtovich's avatar
      ALTER TABLE fixes for high-level indexes · 262232bc
      Sergey Vojtovich authored
      quick_rm_table() expects .frm to exist when it removes high-level indexes.
      For cases like ALTER TABLE t1 RENAME TO t2, ENGINE=other_engine the .frm
      was removed earlier.
      
      Another option would be removing high-level indexes explicitly before the
      first quick_rm_table() and skipping high-level indexes for subsequent
      quick_rm_table(NO_FRM_RENAME).
      
      But this suggested order may also help with DDL log recovery. That is,
      if we crash before high-level indexes are removed, .frm is going to
      exist.
    • Sergey Vojtovich's avatar
      ALTER TABLE fixes for high-level indexes · 9ace07b3
      Sergey Vojtovich authored
      Disable non-copy ALTER algorithms when a VECTOR index is affected.
      Engines are not supposed to handle high-level indexes anyway.
      
      Also fixed misbehaving IF [NOT] EXISTS variants.
    • Sergey Vojtovich's avatar
      ALTER TABLE fixes for high-level indexes · c0a0eb80
      Sergey Vojtovich authored
      Fixes for ALTER TABLE ... ADD/DROP COLUMN, ALGORITHM=COPY.
      
      Let quick_rm_table() remove high-level indexes along with original table.
      
      Avoid locking uninitialized LOCK_share for INTERNAL_TMP_TABLEs.
      
      Don't enable bulk insert when altering a table containing a vector index.
      InnoDB can't handle the situation when bulk insert is enabled for one
      table but disabled for another. We can't do bulk insert on a vector index
      as it currently does table updates.
    • Sergei Golubchik's avatar
      if we require Eigen, we can as well use it everywhere · 518f9cb7
      Sergei Golubchik authored
      it's measurably faster even in items
    • Sergei Golubchik's avatar
      VEC_Distance_Cosine() · 78bd693a
      Sergei Golubchik authored
    • Sergei Golubchik's avatar
      rename VEC_Distance to VEC_Distance_Euclidean · 919e024b
      Sergei Golubchik authored
      and create a parent Item_func_vec_distance_common class
    • Sergei Golubchik's avatar
    • Sergei Golubchik's avatar
      145d84d8
    • Sergei Golubchik's avatar
      cleanup: extract transaction-related part of handlerton · 11b45834
      Sergei Golubchik authored
      into a separate transaction_participant structure
      
      handlerton inherits it, so handlerton itself doesn't change.
      but entities that only need to participate in a transaction,
      like binlog or online alter log, use a transaction_participant
      and no longer need to pretend to be a full-blown but invisible
      storage engine which doesn't support create table.
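The refactoring described in the commit above is a base-struct extraction. The sketch below shows the shape of it with heavily simplified, hypothetical members (the real structures take different arguments and have many more hooks); `commit_all` is likewise an illustrative helper.

```cpp
#include <cassert>
#include <vector>

// Only the transaction-related hooks live here (simplified signatures).
struct transaction_participant
{
  int (*commit)(bool all) = nullptr;
  int (*rollback)(bool all) = nullptr;
};

// A storage engine is a participant plus table-level operations.
struct handlerton : transaction_participant
{
  int (*create_table)(const char *name) = nullptr;
};

// Code that coordinates a transaction only needs the base type, so a
// binlog-like participant and a full engine can sit in the same list.
int commit_all(const std::vector<transaction_participant *> &participants,
               bool all)
{
  for (transaction_participant *p : participants)
  {
    assert(p != nullptr);
    if (p->commit)
      if (int err = p->commit(all))
        return err;  // stop at the first failure
  }
  return 0;
}
```

Because handlerton inherits from the base struct, existing engines are unaffected, while something like the binlog can be a plain `transaction_participant` without stubbing out table operations.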
    • Sergei Golubchik's avatar
      cleanup: handlerton · 212ae15d
      Sergei Golubchik authored
      remove unused methods, reorder methods, add comments
    • Sergei Golubchik's avatar
      AVX-512 support · 82d00ec4
      Sergei Golubchik authored
    • Sergei Golubchik's avatar
      subdist optimization · 92878e97
      Sergei Golubchik authored
      1. randomize all vectors via multiplication by a random orthogonal
         matrix
         * to generate the matrix, fill a square matrix with normally
           distributed random values and create an orthogonal matrix with
           the QR decomposition
         * the rnd generator is seeded with the number of dimensions,
           so the matrix will always be the same for a given table
         * multiplication by an orthogonal matrix is a "rotation", so
           does not change distances or angles
      2. when calculating the distance, first calculate a "subdistance",
         the distance between projections to the first subdist_part
         coordinates (=192, best by test: if it's larger it's less efficient,
         if it's smaller the error rate is too high)
      3. calculate the full distance only if "subdistance" isn't confidently
         higher (above subdist_margin) than the distance we're comparing with
         * it might look like it would make sense to do a second projection
           at, say, subdist_part*2, and so on - but in practice one check
           is enough, the projected distance converges quickly and if it
           isn't confidently higher at subdist_part, it won't be later either
      
      This optimization introduces a constant overhead per insert/search
      operation - an input/query vector has to be multiplied by the matrix.
      And the optimization saves on every distance calculation. Thus it is only
      beneficial when the number of distance calculations (which grows with M
      and with the table size) is high enough to outweigh the constant
      overhead. Let's use the MIN_ROWS table option to estimate the number of
      rows in the table. use_subdist_heuristic() is optimal for mnist and
      fashion-mnist (784 dimensions, 60k rows) and variations of gist (960
      dimensions, 200k, 400k, 600k, 800k, 1000k rows).
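Steps 2 and 3 of the subdist optimization above can be sketched as an early-exit distance check. This is one reading of the commit message, not the actual implementation: the rotation by a random orthogonal matrix (step 1) is omitted, the scaling of the partial distance to a full-distance estimate is an assumption, and `sq_dist` and `closer_than` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Squared Euclidean distance over coordinates [from, to).
float sq_dist(const std::vector<float> &a, const std::vector<float> &b,
              std::size_t from, std::size_t to)
{
  float sum = 0.0f;
  for (std::size_t i = from; i < to; i++)
  {
    const float d = a[i] - b[i];
    sum += d * d;
  }
  return sum;
}

// Returns true if a is closer to b than threshold_sq (squared distance).
// The full distance is computed only when the estimate from the first
// subdist_part coordinates is not confidently above the threshold.
bool closer_than(const std::vector<float> &a, const std::vector<float> &b,
                 float threshold_sq, std::size_t subdist_part,
                 float subdist_margin)
{
  assert(a.size() == b.size() && subdist_part > 0);
  const std::size_t n = a.size();
  if (subdist_part >= n)
    return sq_dist(a, b, 0, n) < threshold_sq;
  const float sub = sq_dist(a, b, 0, subdist_part);
  // Assumption: after a random rotation every coordinate carries a similar
  // share of the squared distance, so the partial sum scales linearly.
  const float estimate =
      sub * static_cast<float>(n) / static_cast<float>(subdist_part);
  if (estimate > threshold_sq * subdist_margin)
    return false;  // confidently farther: skip the remaining coordinates
  return sub + sq_dist(a, b, subdist_part, n) < threshold_sq;
}
```

The early exit is a heuristic: it can reject a vector that the full calculation would have accepted, which is the error rate the commit message trades against speed when choosing subdist_part and subdist_margin.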
    • Sergei Golubchik's avatar
      e5d56bc2
    • Sergei Golubchik's avatar
      fix for rename · 943f008d
      Sergei Golubchik authored