1. 20 May, 2021 7 commits
  2. 19 May, 2021 33 commits
    • MDEV-22530 Aborting OPTIMIZE TABLE still logs in binary log and replicates to the Slave server · 1fff2398
      Sergei Golubchik authored
      Followup. If the KILL happens - report it as a failure,
      don't eat it up silently. Note that this has to be done after `table_name`
      is populated, so that the error message could show it.
    • Sergei Golubchik's avatar
      16d8763b
    • fixes for win32 · c3b50038
      Sergei Golubchik authored
      in win32 stat_info.st_size is __int64, not size_t
    • show pmem detection in cmake · ecd65884
      Sergei Golubchik authored
    • fix compilation w/o partitioning · b2340cdd
      Sergei Golubchik authored
    • switch columnstore from pre-alpha 6.1.1 back to 5.5.2-3 · f44c5e5e
      Sergei Golubchik authored
      suppress columnstore boost warning
    • remove thread_pool_priv.h · 7700626f
      Sergei Golubchik authored
    • Remove not used IPC_COND_USED_INDEX · 0c7b0189
      Monty authored
    • MDEV-25606: Concurrent CREATE TRIGGER statements mix up in binlog and break replication · acf282c3
      Monty authored
      The bug is that we don't have a lock on the trigger name, so it is
      possible for two threads to try to create the same trigger at the same
      time and both think that they have succeeded.
      The same thing can happen with drop trigger or a combination of create and
      drop trigger.
      
      Fixed by adding an MDL lock for the trigger name for the duration of the
      create/drop.
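
      A minimal sketch of the idea, in Python rather than the server's C++:
      a per-name lock is held across the existence check and the creation, so
      two concurrent creators cannot both succeed. NameLocks, create_trigger
      and the in-memory triggers set are hypothetical stand-ins, not the
      server's MDL API.

```python
import threading
from collections import defaultdict


class NameLocks:
    """Hands out one lock per name (a toy stand-in for a metadata lock)."""

    def __init__(self):
        self._guard = threading.Lock()
        self._locks = defaultdict(threading.Lock)

    def lock_for(self, name):
        # The guard protects the dictionary itself against concurrent inserts.
        with self._guard:
            return self._locks[name]


name_locks = NameLocks()
triggers = set()  # hypothetical in-memory catalogue of trigger names


def create_trigger(name):
    """Return True if this caller created the trigger, False if it existed."""
    with name_locks.lock_for(name):  # held for the whole check-and-create
        if name in triggers:
            return False
        triggers.add(name)
        return True
```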
    • Added checking to protect against simultaneous double free in safemalloc · 79d9a725
      Monty authored
      If two threads would call sf_free() / free_memory() at the same time,
      bad_ptr() would not detect this. Fixed by adding extra detection
      when working with the memory region under sf_mutex.
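
      The kind of check described can be sketched as follows, assuming a toy
      allocator that tracks live blocks in a set. TrackedHeap is hypothetical
      and does not mirror safemalloc's data structures; it only shows why the
      liveness check has to happen under the same mutex as the removal.

```python
import threading


class TrackedHeap:
    """Toy allocator that records live blocks; not safemalloc itself."""

    def __init__(self):
        self._mutex = threading.Lock()
        self._live = set()
        self._next_id = 0

    def alloc(self):
        with self._mutex:
            self._next_id += 1
            self._live.add(self._next_id)
            return self._next_id

    def free(self, block):
        # Check and removal share one critical section, so two concurrent
        # frees of the same block cannot both pass the check.
        with self._mutex:
            if block not in self._live:
                raise RuntimeError(f"double free or bad pointer: {block}")
            self._live.remove(block)
```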
      
      Other things:
      - If safe_malloc crashes while the mutex is held, stack trace printing will
        hang because malloc is called by my_open(), which is used by the stack
        trace printing code. Fixed by adding MY_NO_REGISTER flag to my_open,
        which will disable the malloc() call to remember the file name.
    • Fixed hang in concurrent DROP TABLE and BACKUP LOCK BLOCK_DDL · 744a5380
      Monty authored
      The problem was that tdc_remove_referenced_share() did not take into
      account that someone could push things into share->free_tables() even
      if there is a MDL_EXCLUSIVE lock on the table.
      This can happen if flush_tables() uses the table cache to flush a
      non transactional table to disk.
    • MDEV-19198 - DBUG assert in CREATE IF NOT EXIST under LOCK TABLES WRITE · 0b59320d
      Monty authored
      Fixed the ASSERT to take care of the case when table already existed.
    • Added ER_... labels to mtr fatal error messages · 0bc3a080
      Monty authored
    • Fix all warnings given by UBSAN · cc125beb
      Monty authored
      The 'special' cases where we disable, suppress or circumvent UBSAN are:
      - ref10 source (as here we intentionally do some shifts that UBSAN
        complains about).
      - x86 version of optimized int#korr() methods. UBSAN does not like unaligned
        memory access of integers.  Fixed by using byte_order_generic.h when
        compiling with UBSAN
      - We use smaller thread stack with ASAN and UBSAN, which forced me to
        disable a few tests that print the thread stack size.
      - Verifying class types does not work for shared libraries. I added
        suppression in mysql-test-run.pl for this case.
      - Added '#ifdef WITH_UBSAN' when using integer arithmetic where it is
        safe to have overflows (two cases, in item_func.cc).
      
      Things fixed:
      - Don't left shift signed values
        (byte_order_generic.h, mysqltest.c, item_sum.cc and many more)
      - Don't assign non-existing values to enum variables.
      - Ensure that bool and enum values are properly initialized in
        constructors.  This was needed as UBSAN checks that these types have
        correct values when one copies an object.
        (gcalc_tools.h, ha_partition.cc, item_sum.cc, partition_element.h ...)
      - Ensure we do not call handler functions on unallocated objects or
        deleted objects.
        (events.cc, sql_acl.cc).
      - Fixed bugs in Item_sp::Item_sp() where we did not call constructor
        on Query_arena object.
      - Fixed several casts of objects to an incompatible class!
        (Item.cc, Item_buff.cc, item_timefunc.cc, opt_subselect.cc, sql_acl.cc,
         sql_select.cc ...)
      - Ensure we do not do integer arithmetic that causes over or underflows.
        This includes also ++ and -- of integers.
        (Item_func.cc, Item_strfunc.cc, item_timefunc.cc, sql_base.cc ...)
      - Added JSON_VALUE_UNITIALIZED to json_value_types and ensure that
        value_type is initialized to this instead of to -1, which is not a valid
        enum value for json_value_types.
      - Ensure we do not call memcpy() when second argument could be null.
      
      Other things:
      
      - Changed struct st_position to an OBJECT and added an initialization
        function to it to ensure that we do not copy or use uninitialized
        members. The change to a class was also motivated by the fact that we
        used "struct st_position" and POSITION randomly throughout the code,
        which was confusing.
      - Notably big rewrite in sql_acl.cc to avoid using deleted objects.
      - Changed in sql_partition to use '^' instead of '-'. This is safe as
        the operand is either 0 or 0x8000000000000000ULL.
      - Added check for select_nr < INT_MAX in JOIN::build_explain() to
        avoid bug when get_select() could return NULL.
      - Reordered elements in POSITION for better alignment.
      - Changed sql_test.cc::print_plan() to use pointers instead of objects.
      - Fixed bug in find_set() where we could execute '1 << -1'.
      - Added variable have_sanitizer, used by mtr.  (This variable previously
        existed only in 10.5 and up).  It can now have one of two values:
        ASAN or UBSAN.
      - Moved ~Archive_share() from ha_archive.cc to ha_archive.h and marked
        it virtual. This was an effort to get UBSAN to work with loaded storage
        engines. I kept the change as the new place is better.
      - Added in CONNECT engine COLBLK::SetName(), to get around a wrong cast
        in tabutil.cpp.
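
      The '^' versus '-' change in sql_partition can be checked with a small
      worked example. This is only a sketch and assumes 64-bit unsigned
      wraparound arithmetic, which Python emulates here with a mask; the
      helper names are hypothetical. Subtracting 2^63 modulo 2^64 only flips
      the top bit, which is exactly what XOR with 2^63 does.

```python
MASK64 = (1 << 64) - 1
SIGN_BIT = 0x8000000000000000


def sub_u64(x, m):
    """x - m with 64-bit unsigned wraparound."""
    return (x - m) & MASK64


def xor_u64(x, m):
    """x ^ m restricted to 64 bits."""
    return (x ^ m) & MASK64


# The two operations agree whenever m is 0 or the sign bit ...
for x in (0, 1, 0x7FFFFFFFFFFFFFFF, SIGN_BIT, MASK64):
    for m in (0, SIGN_BIT):
        assert sub_u64(x, m) == xor_u64(x, m)

# ... but not for arbitrary m, so the restriction on the operand matters.
assert sub_u64(1, 2) != xor_u64(1, 2)
```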
      
      Changes that should not be needed but had to be done to suppress warnings
      from UBSAN:
      
      - Added static_cast<uint16_t> around shift to get rid of a LOT of
        compiler warnings when using UBSAN.
      - Had to change some '/' by power-of-2 integers to shifts to get rid of
        some compile time warnings.
      
      Fixes:
      
      MDEV-25505 Assertion `old_flags == ((my_flags & 0x10000U) ? 1 : 0)
      fixed (was caused by an old version of this commit).
      
      Reviewed by:
      - Json changes: Alexey Botchkov
      - Charset changes in ctype-uca.c: Alexander Barkov
      - InnoDB changes: Marko Mäkelä
      - sql_acl.cc changes: Vicențiu Ciorbaru
      - build_explain() changes: Sergey Petrunia
      Temporary commit to log changes for UBSAN
    • Monty's avatar
      b332ffc1
    • Monty's avatar
      85f3ed5f
    • Fixes for mtr --valgrind · 0fceb752
      Monty authored
      Disabled show_explain when used with valgrind because of random failures
      Disable ssl-big when running with --valgrind as it takes more than 1 hour
    • Made --mariadbd a synonym for --mysqld in mysql-test-run · aa1626d1
      Monty authored
      - mariadbd and mariadbd-env added
      - Changed output of print_global_resfile to use mariadbd instead of mysqld
    • MDEV-18465 Logging of DDL statements during backup · 83e529ec
      Monty authored
      Many of the changes were needed to be able to collect and print engine
      name and table version ids in the ddl log.
    • Move debug_crash_here to its own source files · 496a14e1
      Monty authored
    • Create a backup file of ddl_recovery.log before starting recovery · ad02d53a
      Monty authored
      This is done by prefixing -backup.log to the --log-ddl-recovery file.
      The reason for this is to have a copy of the original ddl log file
      if ddl recovery does not succeed.
    • MDEV-25180 Atomic ALTER TABLE · 7762ee5d
      Monty authored
      MDEV-25604 Atomic DDL: Binlog event written upon recovery does not
                 have default database
      
      The purpose of this task is to ensure that ALTER TABLE is atomic even if
      the MariaDB server would be killed at any point of the alter table.
      This means that either the ALTER TABLE succeeds (including that triggers,
      the status tables and the binary log are updated) or things should be
      reverted to their original state.
      
      If the server crashes before the new version is fully up to date and
      committed, it will revert to the original table and remove all
      temporary files and tables.
      If the new version is committed, crash recovery will use the new version,
      and update triggers, the status tables and the binary log.
      The one exception is ALTER TABLE .. RENAME .. where no changes are made
      to the table definition. This one will work as RENAME and roll back unless
      the whole statement completed, including updating the binary log (if
      enabled).
      
      Other changes:
      - Added handlerton->check_version() function to allow the ddl recovery
        code to check, in case of inplace alter table, if the table in the
        storage engine is of the new or old version.
      - Added handler->table_version() so that an engine can report the current
        version of the table. This should be changed each time the table
        definition changes.
      - Added ha_signal_ddl_recovery_done() and
        handlerton::signal_ddl_recovery_done() to inform all handlers when
        ddl recovery has been done. (Needed by InnoDB).
      - Added handlerton call inplace_alter_table_committed, to signal engine
        that ddl_log has been closed for the alter table query.
      - Added new handlerton flag
        HTON_REQUIRES_NOTIFY_TABLEDEF_CHANGED_AFTER_COMMIT to signal when we
        should call hton->notify_tabledef_changed() during
        mysql_inplace_alter_table. This was required as MyRocks and InnoDB
        needed the call at different times.
      - Added function server_uuid_value() to be able to generate a temporary
        xid when ddl recovery writes the query to the binary log. This is
        needed to be able to handle crashes during ddl log recovery.
      - Moved freeing of the frm definition to end of mysql_alter_table() to
        remove duplicate code and have a common exit strategy.
      
      -------
      InnoDB part of atomic ALTER TABLE
      (Implemented by Marko Mäkelä)
      innodb_check_version(): Compare the saved dict_table_t::def_trx_id
      to determine whether an ALTER TABLE operation was committed.
      
      We must correctly recover dict_table_t::def_trx_id for this to work.
      Before purge removes any trace of DB_TRX_ID from system tables, it
      will make an effort to load the user table into the cache, so that
      the dict_table_t::def_trx_id can be recovered.
      
      ha_innobase::table_version(): return garbage, or the trx_id that would
      be used for committing an ALTER TABLE operation.
      
      In InnoDB, table names starting with #sql-ib will remain special:
      they will be dropped on startup. This may be revisited later in
      MDEV-18518 when we implement proper undo logging and rollback
      for creating or dropping multiple tables in a transaction.
      
      Table names starting with #sql will retain some special meaning:
      dict_table_t::parse_name() will not consider such names for
      MDL acquisition, and dict_table_rename_in_cache() will treat such
      names specially when handling FOREIGN KEY constraints.
      
      Simplify InnoDB DROP INDEX.
      Prevent purge wakeup
      
      To ensure that dict_table_t::def_trx_id will be recovered correctly
      in case the server is killed before ddl_log_complete(), we will block
      the purge of any history in SYS_TABLES, SYS_INDEXES, SYS_COLUMNS
      between ha_innobase::commit_inplace_alter_table(commit=true)
      (purge_sys.stop_SYS()) and purge_sys.resume_SYS().
      The completion callback purge_sys.resume_SYS() must be between
      ddl_log_complete() and MDL release.
      
      --------
      
      MyRocks support for atomic ALTER TABLE
      (Implemented by Sergey Petrunia)
      
      Implement these SE API functions:
      - ha_rocksdb::table_version()
      - hton->check_version = rocksdb_check_version

      MyRocks data dictionary now stores table version for each table.
      (Absence of table version record is interpreted as table_version=0,
      which means no upgrade changes are needed)
      - For inplace alter table of a partitioned table, call the underlying
        handlerton when checking if the table is ok. This assumes that the
        partition engine commits all changes at once.
    • Check if we can rename triggers before doing an ALTER TABLE ... RENAME · 3c578b0a
      Monty authored
      ALTER TABLE .. RENAME, when used with the inplace algorithm, does:
      - Do an inplace or online alter to the new definition
      - Rename to new name
      - Update triggers.
      
      If update triggers would fail, we would rename the table back.
      The problem with this approach is that the table would have the new
      definition but the rename would fail.  The binary log would also not be
      updated.
      
      The solution to this is to check very early whether we can rename
      triggers and give an error if this would fail.
      Both ALTER TABLE ... RENAME and RENAME TABLE are fixed.
      
      This was implemented by moving the pre-check of rename table in triggers
      from Table_triggers_list::change_table_name() to
      Table_triggers_list::prepare_for_rename().
    • Ensure that one can drop a trigger with an orphan .TRN file · c844a76b
      Monty authored
      Before this fix, one would get a 'Trigger ... already exists' when trying
      to create a trigger matching the original name and 'Trigger ... does not
      exist' when trying to drop it.
      
      Fixes a reported bug in MDEV-25180 Atomic ALTER TABLE
      
      MDEV-25517 Atomic DDL: Assertion `query_arg' in THD::binlog_query
      upon DROP TRIGGER
      
      The bug was that the stmt_query variable was not populated
      with the query in case of DROP TRIGGER of an orphan trigger
      (.TRN file exists & table exists, but the trigger was not in
      table->triggers).
    • MDEV-24746 Atomic CREATE TRIGGER · ffe7f19f
      Monty authored
      The purpose of this task is to ensure that CREATE TRIGGER is atomic
      
      When a trigger is created, we first create a trigger_name.TRN file and then
      create or update the table_name.TRG files.
      This is done by creating .TRN~ and .TRG~ files and replacing (or creating)
      the result files.
      
      The new logic is
      
      - Log CREATE TRIGGER to DDL log, with a marker if old trigger existed
      - If old .TRN or .TRG files exist, make backup copies of these
      - Create the new .TRN and .TRG files as before
      - Remove the backups
      
      Crash recovery
      - If query has been logged to binary log:
        - delete any left over backup files
      - else
         - Delete any old .TRN~ or .TRG~ files
         - If there were originally some triggers (old .TRG file existed)
            - If we crashed before creating all backup files
               - Delete existing backup files
            - else
               - Restore backup files
            - end
         - Delete .TRN and .TRG file (as there were no triggers before)
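
      The crash recovery steps above can be sketched as a decision function
      that returns the actions to perform, in order. This is an illustration
      only: the function name, its parameters and the action strings are
      hypothetical, and it assumes that the final "Delete .TRN and .TRG file"
      step applies only when no triggers existed before.

```python
def create_trigger_recovery(in_binlog, old_trg_existed, all_backups_created):
    """Return the recovery actions for an interrupted CREATE TRIGGER."""
    if in_binlog:
        # The statement completed; only stale backups can remain.
        return ["delete leftover backup files"]

    actions = ["delete .TRN~ and .TRG~ files"]
    if old_trg_existed:
        if all_backups_created:
            actions.append("restore backup files")
        else:
            # An incomplete backup set means the originals are untouched.
            actions.append("delete existing backup files")
    else:
        # Assumption: this step is the branch for "no triggers before".
        actions.append("delete .TRN and .TRG files")
    return actions
```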
      
      One benefit of the new code is that CREATE OR REPLACE TRIGGER is now
      totally atomic even if there existed an old trigger: Either the old
      trigger will be replaced or the old one will be left untouched.
      
      Other things:
      - If sql_create_definition_file() would fail, there could be memory leaks
        in CREATE TRIGGER, DROP TRIGGER or CREATE OR REPLACE TRIGGER.  This
        is now fixed.
    • MDEV-24607 Atomic CREATE VIEW · d494abd1
      Monty authored
      The logic of the new code is:
      - Log CREATE view to DDL log, with a marker if old view existed
      - If old view exists (in case of CREATE or REPLACE view), make a copy
        of the old view as view_name.frm-
      - Create the new view definition file
      - Delete copy of view if it was created.
      
      Crash recovery:
      - Delete view_name.frm~ file (Temporary file for view definition)
      - If query was logged to binary log
        - Delete copy of view if it exists
      - else
         - Rename the copy of the view over the .frm file (restoring the
           old definition)
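
      A file-level sketch of these recovery steps, assuming the suffixes
      named above (".frm~" for the temporary definition, ".frm-" for the copy
      of the old view). recover_create_view and its parameters are
      hypothetical and not part of the server.

```python
import os


def recover_create_view(frm_path, logged_to_binlog):
    """Bring view_name.frm back to a consistent state after a crash."""
    tmp, backup = frm_path + "~", frm_path + "-"

    if os.path.exists(tmp):
        os.remove(tmp)  # temporary file for the view definition

    if not os.path.exists(backup):
        return  # no old view was copied, nothing to restore or clean up

    if logged_to_binlog:
        os.remove(backup)  # the new view is committed; drop the copy
    else:
        os.replace(backup, frm_path)  # restore the old definition
```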
      
      One benefit of the new code is that CREATE OR REPLACE VIEW for an
      existing view is now fully atomic: Either the view will be replaced or
      the old one will be left unchanged.
    • MDEV-24576 Atomic CREATE TABLE · 6aa9a552
      Monty authored
      There are a few different cases to consider
      
      Logging of CREATE TABLE and CREATE TABLE ... LIKE
      - If REPLACE is used and there was an existing table, DDL log the drop of
        the table.
      - If discovery of table is to be done
          - DDL LOG create table
        else
          - DDL log create table (with engine type)
          - create the table
      - If table was created
        - Log entry to binary log with xid
        - Mark DDL log completed
      
      Crash recovery:
      - If query was in binary log do nothing and exit
      - If discovered table
         - Delete the .frm file
      - else
         - Drop created table and frm file
      - If table was dropped, write a DROP TABLE statement in binary log
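
      As a rough sketch, the recovery list above maps to the following
      decision function. The name, the flags and the action strings are
      hypothetical; "old_table_dropped" assumes that the last step refers to
      an existing table that was dropped because REPLACE was used.

```python
def create_table_recovery(in_binlog, discovered, old_table_dropped):
    """Return the recovery actions for an interrupted CREATE TABLE."""
    if in_binlog:
        return []  # the statement completed; do nothing and exit

    if discovered:
        actions = ["delete .frm file"]
    else:
        actions = ["drop created table and .frm file"]

    if old_table_dropped:
        # Assumption: replicas still need to learn about the dropped table.
        actions.append("binlog: DROP TABLE")
    return actions
```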
      
      CREATE TABLE ... SELECT required a little more work as when one is using
      statement logging the query is written to the binary log before commit is
      done.
      This was fixed by adding a DROP TABLE to the binary log during crash
      recovery if the ddl log entry was not closed. In this case the binary log
      will contain:
      CREATE TABLE xxx ... SELECT ....
      DROP TABLE xxx;
      
      Other things:
      - Added debug_crash_here() functionality to Aria to be able to test
        crash in create table between the creation of the .MAI and the .MAD files.
    • MDEV-24408 Crash-safe DROP DATABASE · 7a588c30
      Monty authored
      Description of how DROP DATABASE works after this patch
      
      - Collect list of tables
      - DDL log tables as they are dropped
      - DDL log drop database
      - Delete db.opt
      - Delete data directory
      - Log either DROP TABLE or DROP DATABASE to binary log
      - Deactivate ddl log entry

      This is in line with how things were before (minus ddl logging) except that
      we delete db.opt file last to not lose it if DROP DATABASE fails.
      
      On recovery we have to ensure that all dropped tables are logged in
      binary log and that they are properly dropped (as with atomic drop
      table).
      No new tables will be dropped as part of recovery.
      
      Recovery of active drop database ddl log entry:
      
      - If drop database was logged to ddl log but was not found in the binary
        log:
        - drop the db.opt file and database directory.
        - Log DROP DATABASE to binary log
      - If drop database was not logged to ddl log
        - Update binary log with DROP TABLE of the dropped tables. If table list
          is longer than max_allowed_packet, then the query will be split into
          multiple DROP TABLE/VIEW queries.
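
      Splitting a long table list into several statements that each fit a
      size limit could look like the sketch below. It is a simplified
      illustration: split_drop_statements, the statement prefix and the
      quoting are assumptions, and max_len merely stands in for
      max_allowed_packet.

```python
def split_drop_statements(tables, max_len, prefix="DROP TABLE IF EXISTS "):
    """Pack quoted table names into as few statements as fit in max_len bytes."""
    statements, current = [], []
    for name in tables:
        quoted = "`" + name.replace("`", "``") + "`"
        candidate = prefix + ",".join(current + [quoted])
        if current and len(candidate.encode()) > max_len:
            # Adding this table would exceed the limit: flush and start over.
            statements.append(prefix + ",".join(current))
            current = [quoted]
        else:
            current.append(quoted)
    if current:
        statements.append(prefix + ",".join(current))
    return statements
```

      A single name that alone exceeds the limit still gets its own
      statement; the sketch does not try to handle that case.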
      
      Other things:
      - Added DDL_LOG_STATE and 'current database' as arguments to
        mysql_rm_table_no_locks(). This was needed to be able to combine
        ddl logging of DROP DATABASE and DROP TABLE and make the generated
        DROP TABLE statements shorter.
      - To make the DROP TABLE statement created by ddl log shorter, I changed
        the binlogged query to use current directory and omit the directory
        part for all tables in the current directory.
      - Merged some DROP TABLE and DROP VIEW code in ddl logger.  This was done
        to be able to get separate DROP VIEW and DROP TABLE statements in the
        binary log.
      - Added a 'recovery_state' variable to remember the state of dropped
        tables and views.
      - Moved out code that drops database objects (stored procedures) from
        mysql_rm_db_internal() to drop_database_objects() for better code reuse.
      - Made mysql_rm_db_internal() global so that it could be used by the ddl
        recovery code.
    • MDEV-24395 Atomic DROP TRIGGER · 407e9b78
      Monty authored
      The purpose of this task is to ensure that DROP TRIGGER is atomic.
      
      Description of how atomic drop trigger works:
      
      Logging of DROP TRIGGER
          Log the following information:
          db
          table name
          trigger name
          xid /* Used to check if query was already logged to binary log */
          initial length of the .TRG file
          query if there is space for it, if not log a zero length query.
      
      Recovery operations:
      - Delete if exists 'database/trigger_name.TRN~'
        - If this file existed, it means that we crashed before the trigger
          was deleted and there is nothing else to do.
      - Get length of .TRG file
        - If file length is unchanged, trigger was not dropped. Nothing else to
          do.
      - Log original query to binary log, if it was stored in the ddl log. If it
        was not stored (long query string), log the following query to binary log:
        use `database` ; DROP TRIGGER IF EXISTS `trigger_name`
        /* generated by ddl log */;
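
      The recovery operations above can be sketched as a function that
      returns the query to write to the binary log, or None when nothing has
      to be logged. It assumes the xid check already showed that the
      statement is not in the binary log, and that a missing .TRG file counts
      as length 0; the function name and its parameters are hypothetical.

```python
import os


def recover_drop_trigger(datadir, db, table, trigger, initial_trg_len, logged_query):
    """Return the query to binlog for an interrupted DROP TRIGGER, or None."""
    trn_tmp = os.path.join(datadir, db, trigger + ".TRN~")
    if os.path.exists(trn_tmp):
        os.remove(trn_tmp)
        return None  # crashed before the trigger was deleted

    trg = os.path.join(datadir, db, table + ".TRG")
    # Assumption: a missing .TRG file is treated as an empty one.
    current_len = os.path.getsize(trg) if os.path.exists(trg) else 0
    if current_len == initial_trg_len:
        return None  # .TRG is unchanged, so the trigger was not dropped

    if logged_query:
        return logged_query
    # The query was too long to store in the ddl log; generate a replacement.
    return (f"use `{db}` ; DROP TRIGGER IF EXISTS `{trigger}` "
            "/* generated by ddl log */;")
```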
      
      Other things:
      - Added trigger name and DDL_LOG_STATE to drop_trigger()
        Trigger name was added to make the interface more consistent and
        more general.
    • MDEV-23844 Atomic DROP TABLE (single table) · e3cfb7c8
      Monty authored
      Logging logic:
      - Log tables, just before they are dropped, to the ddl log
      - After the last table for the statement is dropped, log an xid for the
        whole ddl log event
      
      In case of crash:
      - First remove any active DROP TABLE events from the ddl log that match
        xids found in binary log (this means the drop was successful and was
        properly logged).
      - Loop over all active DROP TABLE events
        - Ensure that the table is completely dropped
      - Write a DROP TABLE entry to the binary log with the dropped tables.
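
      A compact sketch of this crash handling, assuming each ddl log entry is
      a pair of an xid (or None if none was logged yet) and a list of table
      names. recover_drop_tables, the drop_table callback and the generated
      statement text are hypothetical.

```python
def recover_drop_tables(ddl_entries, binlog_xids, drop_table):
    """Finish interrupted DROP TABLE entries; return statements to binlog."""
    statements = []
    for xid, tables in ddl_entries:
        if xid in binlog_xids:
            continue  # the drop succeeded and was properly logged
        for table in tables:
            drop_table(table)  # must be safe to repeat on a half-dropped table
        names = ",".join(f"`{t}`" for t in tables)
        statements.append("DROP TABLE IF EXISTS " + names)
    return statements
```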
      
      Other things:
      - Added code to ha_drop_table() to be able to tell the difference if
        a get_new_handler() failed because of out-of-memory or because the
        handler refused/was not able to create a handler. This was needed
        to get sequences to work as sequences need a share object to be passed
        to get_new_handler()
      - TC_LOG_BINLOG::recover() was changed to always collect Xid's from the
        binary log and always call ddl_log_close_binlogged_events(). This was
        needed to be able to collect DROP TABLE events with embedded Xid's
        (used by ddl log).
      - Added a new variable "$grep_script" to binlog filter to be able to find
        only rows that match a regexp.
      - Had to adjust some tests that changed because drop statements are a bit
        larger in the binary log than before (as we have to store the xid)
      
      Other things:
      - MDEV-25588 Atomic DDL: Binlog query event written upon recovery is corrupt
        fixed (in the original commit).
    • MDEV-23842 Atomic RENAME TABLE · 47010ccf
      Monty authored
      - Major rewrite of ddl_log.cc and ddl_log.h
        - ddl_log.cc describes at the beginning how the recovery works.
        - ddl_log.log has unique signature and is dynamic. It's easy to
          add more information to the header and other ddl blocks while still
          being able to execute old ddl entries.
        - IO_SIZE for ddl blocks is now dynamic. Can be changed without affecting
          recovery of old logs.
        - Code is more modular and is now usable outside of partition handling.
        - Renamed log file to ddl_recovery.log and added option --log-ddl-recovery
          to allow one to specify the path & filename.
      - Added ddl_log_entry_phase[], number of phases for each DDL action,
        which allowed me to greatly simplify set_global_from_ddl_log_entry()
      - Changed how strings are stored in log entries, which allows us to
        store much more information in a log entry.
      - ddl log is now always created at start and deleted on normal shutdown.
        This simplifies things notably.
      - Added probes debug_crash_here() and debug_simulate_error() to simplify
        crash testing and allow crash after a given number of times a probe
        is executed. See comments in debug_sync.cc and rename_table.test for
        how this can be used.
      - Reverting failed table and view renames is done through the ddl log.
        This ensures that the ddl log is tested also outside of recovery.
      - Added helper function 'handler::needs_lower_case_filenames()'
      - Extend binary log with Q_XID events. ddl log handling is using this
        to check if a ddl log entry was logged to the binary log (if yes,
        it will be deleted from the log during ddl_log_close_binlogged_events()).
      - If a DDL entry fails 3 times, disable it. This is to ensure that if
        we have a crash in ddl recovery code the server will not get stuck
        in a forever crash-restart-crash loop.
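
      The retry limit from the last item can be sketched as follows. The
      entry layout ("attempts", "state") and run_recovery are hypothetical,
      and the sketch assumes that the attempt counter is persisted before the
      entry is executed, so that a crash during execution is still counted
      after the restart.

```python
MAX_ATTEMPTS = 3


def run_recovery(entries, execute):
    """Run one recovery pass; disable entries that already failed 3 times."""
    for entry in entries:
        if entry["state"] != "active":
            continue
        if entry["attempts"] >= MAX_ATTEMPTS:
            entry["state"] = "disabled"  # breaks the crash-restart-crash loop
            continue
        # Count the attempt first: a crash inside execute() must not reset it.
        entry["attempts"] += 1
        try:
            execute(entry)
        except Exception:
            continue  # left active; retried on the next recovery pass
        entry["state"] = "done"
```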
      
      mysqltest.cc changes:
      - --die will now replace $variables with their values
      - $error will contain the error of the last failed statement
      
      storage engine changes:
      - maria_rename() was changed to be more robust against crashes during
        rename.
    • Make rename atomic/repeatable in MyISAM and Aria · 55c771b4
      Monty authored
      This is required to make Atomic RENAME TABLE work for these engines
      
      The requirement is that if we have a server crash in the middle of a
      storage engine rename call, the upcoming ddl log recovery should be able
      to finalize it by re-executing the rename.
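
      What "repeatable" means for a table stored in several files can be
      sketched like this, using Aria's .MAI and .MAD files as the example.
      repeatable_rename is a hypothetical helper, not the engines' actual
      rename code: a file that was already moved before the crash is skipped
      when the rename is executed again.

```python
import os


def repeatable_rename(old_base, new_base, extensions=(".MAI", ".MAD")):
    """Rename all files of a table; safe to run again after a partial rename."""
    for ext in extensions:
        src, dst = old_base + ext, new_base + ext
        if os.path.exists(src):
            os.replace(src, dst)
        elif not os.path.exists(dst):
            # Neither the old nor the new file exists: a real error.
            raise FileNotFoundError(src)
        # else: this file was already renamed before the crash
```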
    • Do not display not moved tables as moved in aria_chk · 5e7b1bad
      Monty authored
      This happened because in ma_open() we did not take into account that
      tran_man (Aria transaction manager) would not be initialized.
      Fixed by using the same check for minimum transaction id as we use
      during repair.
      
      Other things:
      - aria_read_log now displays a readable timestamp
      - Removed printing of datapage for header. This removes
        some wrong warnings from the aria_read_log output