1. 22 Jul, 2020 4 commits
    • Thirunarayanan Balathandayuthapani's avatar
      MDEV-23252 Assertion failure 'req_type.is_dblwr_recover() || err ==... · 92014bd1
      Thirunarayanan Balathandayuthapani authored
      MDEV-23252 Assertion failure 'req_type.is_dblwr_recover() || err == DB_SUCCESS' for page_compressed tables
      
      - This issue is caused by a5584b13
      (MDEV-15528), which added os_file_punch_hole() to fil_io()
      but did not handle failure of os_file_punch_hole(). InnoDB should
      handle the DB_IO_NO_PUNCH_HOLE error and silently transform it to
      DB_SUCCESS. InnoDB should also set the punch hole flag correctly when
      the tablespace is loaded.
      
      fil_node_t::read_page0(): Set the punch hole flag when tablespace is loaded
      
      fil_io(): Handle the DB_IO_NO_PUNCH_HOLE error
      
      buf_flush_free_pages(): Check the punch hole condition earlier, using
      the tablespace punch hole flag
      92014bd1
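      The error handling described in this commit can be pictured with a minimal sketch. It is not the InnoDB source: only the names DB_SUCCESS and DB_IO_NO_PUNCH_HOLE come from the commit message, while `node_t`, `fake_punch_hole()` and `free_page_range()` are hypothetical stand-ins, and clearing the flag after a failure is an assumption.

```cpp
#include <cassert>

// Hypothetical stand-ins for InnoDB error codes; only the first two names
// appear in the commit message.
enum dberr_t { DB_SUCCESS, DB_IO_NO_PUNCH_HOLE, DB_IO_ERROR };

// Hypothetical, simplified data file node.
struct node_t {
  bool punch_hole;  // set when the tablespace is loaded
};

// Stand-in for the real hole-punching call: pretends that the file system
// cannot punch holes.
inline dberr_t fake_punch_hole(const node_t&) { return DB_IO_NO_PUNCH_HOLE; }

// Free a page range. The flag is checked first, and "not supported" is
// mapped to success so that callers never see it as a failure.
inline dberr_t free_page_range(node_t& node) {
  if (!node.punch_hole) return DB_SUCCESS;
  dberr_t err = fake_punch_hole(node);
  if (err == DB_IO_NO_PUNCH_HOLE) {
    node.punch_hole = false;  // assumption: remember the result, stop retrying
    err = DB_SUCCESS;
  }
  assert(err != DB_IO_NO_PUNCH_HOLE);
  return err;
}
```

      With this shape, a caller that asserts on DB_SUCCESS is not tripped by a file system that lacks hole punching.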
    • Thirunarayanan Balathandayuthapani's avatar
      MDEV-23254 Replace FSP_FLAGS_HAS_PAGE_COMPRESSION with fil_space_t::is_compressed · d96027c8
      Thirunarayanan Balathandayuthapani authored
      InnoDB should replace FSP_FLAGS_HAS_PAGE_COMPRESSION check with
      fil_space_t::is_compressed(). fil_space_t::is_compressed() checks
      for both non full crc32 and crc32 format.
      d96027c8
    • Jan Lindström's avatar
      Fix regex on test. · 3d01576a
      Jan Lindström authored
      3d01576a
    • sjaakola's avatar
      MDEV-21910 Deadlock between BF abort and manual KILL command · 7bffe468
      sjaakola authored
      When a high-priority replication slave applier encounters a lock conflict in InnoDB,
      it forces the conflicting lock holder transaction (the victim) to roll back.
      This is a must in the multi-master synchronous replication model to avoid cluster lock-up.
      This high-priority victim abort (aka "brute force" (BF) abort) is started
      from the InnoDB lock manager while holding the victim transaction's (trx) mutex.
      Depending on the execution state of the victim transaction, the
      BF abort may call THD::awake() to wake up the victim transaction for the rollback.
      If the BF abort requires THD::awake() to be called, the applier thread executes
      the locking protocol: victim trx mutex -> victim THD::LOCK_thd_data
      
      If, at the same time, another DBMS super user issues a KILL command to abort the same victim,
      it will execute the locking protocol: victim THD::LOCK_thd_data -> victim trx mutex.
      These two locking protocols acquire the mutexes in opposite order, hence an unresolvable
      mutex deadlock may occur.
      
      The fix in this commit adds the THD::wsrep_aborter flag to synchronize who can kill the victim.
      This flag is set both when BF abort is called from InnoDB and by the KILL command.
      Either path of victim killing will bail out if the victim's wsrep_killed is already
      set, to avoid mutex conflicts with the other aborter's execution. THD::wsrep_aborter
      records the aborter THD's ID. This is needed to preserve the right to kill
      the victim from different locations for the same aborter thread.
      It is also good for error logging, to see who is responsible for the abort.
      
      A new test case was added in galera.galera_bf_kill_debug.test for the scenario where
      a wsrep applier thread and a manual KILL command try to kill the same idle victim.
      7bffe468
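      The "first aborter wins, and the same aborter may come back" rule described above can be sketched as follows. This is an illustration only: the commit names THD::wsrep_aborter, but `victim_t`, `claim_abort()` and the use of an atomic compare-and-swap are assumptions, and the actual patch may protect the field differently.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical, simplified victim session.
struct victim_t {
  std::atomic<std::uint64_t> aborter{0};  // 0 means "nobody is aborting"
};

// Try to claim the right to kill the victim on behalf of thread `my_id`.
// Returns true if this thread owns the abort (first claim, or a repeated
// claim by the same thread), false if another aborter got there first.
inline bool claim_abort(victim_t& victim, std::uint64_t my_id) {
  assert(my_id != 0);
  std::uint64_t expected = 0;
  if (victim.aborter.compare_exchange_strong(expected, my_id)) return true;
  // On failure, `expected` holds the ID of the current owner.
  return expected == my_id;
}
```

      A path that loses the claim simply bails out, so it never tries to take the second mutex in the opposite order.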
  2. 21 Jul, 2020 5 commits
    • Marko Mäkelä's avatar
      Merge 10.4 into 10.5 · 4ec032b4
      Marko Mäkelä authored
      4ec032b4
    • Marko Mäkelä's avatar
      Merge 10.3 into 10.4 · b1538f4d
      Marko Mäkelä authored
      b1538f4d
    • Marko Mäkelä's avatar
      MDEV-15880: ASAN heap-use-after-free with innodb_evict_tables_on_commit_debug · b75563cd
      Marko Mäkelä authored
      trx_update_mod_tables_timestamp(): When implementing
      innodb_evict_tables_on_commit_debug, do not evict tables
      on which transactional locks exist.
      
      This debug variable was broken since its introduction in
      commit 947b0b57.
      b75563cd
    • Monty's avatar
      MDEV-16929 Assertion ... in close_thread_tables upon killing connection · e26c822a
      Monty authored
      The problem was that the code didn't handle a transaction created in InnoDB
      as part of a failed mysql_lock_tables().
      e26c822a
    • Monty's avatar
      MDEV-21953 deadlock between BACKUP STAGE BLOCK_COMMIT and parallel repl. · fc48c8ff
      Monty authored
      The issue was:
      T1, a parallel slave worker thread, is waiting for another worker thread to
      commit. While waiting, it has the MDL_BACKUP_COMMIT lock.
      T2, working for mariabackup, is doing BACKUP STAGE BLOCK_COMMIT and blocks
      all commits.
      This causes a deadlock as the thread T1 is waiting for can't commit.
      
      Fixed by moving locking of MDL_BACKUP_COMMIT from ha_commit_trans() to
      commit_one_phase_2()
      
      Other things:
      - Added a new argument to ha_commit_one_phase() to signal if the
        transaction was a write transaction.
      - Ensured that ha_maria::implicit_commit() is always called under
        MDL_BACKUP_COMMIT. This code is not needed in 10.5
      - Ensure that MDL_Request values 'type' and 'ticket' are always
        initialized. This makes it easier to check the state of the MDL_Request.
      - Moved thd->store_globals() earlier in handle_rpl_parallel_thread() as
        thd->init_for_queries() could use an MDL, which could crash if
        store_globals() had not been called.
      - Don't call ha_enable_transactions() in THD::init_for_queries() as this
        is both slow (uses MDL locks) and not needed.
      fc48c8ff
  3. 20 Jul, 2020 13 commits
    • Eugene Kosov's avatar
      MDEV-22899 Assertion `field->col->is_binary() || field->prefix_len %... · c4d5b6b1
      Eugene Kosov authored
      MDEV-22899 Assertion `field->col->is_binary() || field->prefix_len % field->col->mbmaxlen == 0' failed in dict_index_add_to_cache
      
      is_part_of_a_key(): Detect whether a TEXT field is part of some key.
      
      ha_innobase::can_convert_blob(): Now correctly detects whether our blob
      is part of some key. Previously the check didn't work in some cases.
      c4d5b6b1
    • Aleksey Midenkov's avatar
      MDEV-20661 Virtual fields are not recalculated on system fields value assignment · af83ed9f
      Aleksey Midenkov authored
      Fix stale virtual field values in 4 cases: when a virtual field depends
      on row_start or row_end in a timestamp- or trx_id-versioned table. The
      row_start dependency is recalculated in vers_update_fields() (SQL and
      InnoDB layers). The row_end dependency is recalculated on history row insert.
      af83ed9f
    • Aleksey Midenkov's avatar
      MDEV-22061 InnoDB: Assertion of missing row in sec index row_start upon... · af57c658
      Aleksey Midenkov authored
      MDEV-22061 InnoDB: Assertion of missing row in sec index row_start upon REPLACE on a system-versioned table
      
      make_versioned_helper() appended a new update field unconditionally,
      while it should check whether the field already exists in the update vector.
      
      Miscellaneous renames to conform to the versioning prefix. The name
      vers_update_fields() conforms with the SQL layer TABLE::vers_update_fields().
      af57c658
    • Thirunarayanan Balathandayuthapani's avatar
      MDEV-22970 Possible corruption of page_compressed tables, or · c8936686
      Thirunarayanan Balathandayuthapani authored
                 when scrubbing is enabled
      
      buf_read_recv_pages(): Ignore the page to read if it is already
      present in the freed ranges.
      
      store_freed_or_init_rec(): Store the ranges only if scrubbing
      is enabled or the tablespace is page_compressed.
      
      recv_init_crash_recovery_space(): Add the freed range only when
      scrubbing is enabled or the tablespace is page_compressed.
      
      range_set::contains(): Check whether the value is present in the ranges.
      
      range_set::remove_if_exists(): Remove the value if it exists in the ranges.
      
      mtr_t::init(): Handle the scenario where a mini-transaction may allocate
      a page that had just been freed.
      
      recv_sys_t::parse(): Note down the FREE and INIT redo log records
      irrespective of the STORE value.
      
      Removed innodb_tablespaces_scrubbing from test case
      c8936686
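      The two range_set operations named in this commit can be illustrated with a small sketch. It does not reproduce the InnoDB class: `page_ranges`, its `std::map` representation and `add_range()` are hypothetical, and the caller is assumed to add only disjoint ranges.

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <map>

// Hypothetical set of disjoint closed ranges [first, last] of page numbers,
// keyed by the first page of each range.
class page_ranges {
  std::map<std::uint32_t, std::uint32_t> ranges_;

 public:
  // Assumption: the caller never adds overlapping ranges.
  void add_range(std::uint32_t first, std::uint32_t last) {
    assert(first <= last);
    ranges_[first] = last;
  }

  // True if some stored range covers `value`.
  bool contains(std::uint32_t value) const {
    auto it = ranges_.upper_bound(value);  // first range starting after value
    if (it == ranges_.begin()) return false;
    return std::prev(it)->second >= value;
  }

  // Remove `value` if it is covered, splitting its range when necessary.
  void remove_if_exists(std::uint32_t value) {
    auto it = ranges_.upper_bound(value);
    if (it == ranges_.begin()) return;
    --it;
    const std::uint32_t first = it->first, last = it->second;
    if (last < value) return;  // value falls in a gap between ranges
    ranges_.erase(it);
    if (first < value) ranges_[first] = value - 1;
    if (value < last) ranges_[value + 1] = last;
  }
};
```

      In this picture, a page that is allocated again after being freed would be taken out with remove_if_exists(), and recovery would skip reading any page for which contains() holds.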
    • Marko Mäkelä's avatar
      Merge 10.4 into 10.5 · 4d4865de
      Marko Mäkelä authored
      4d4865de
    • Marko Mäkelä's avatar
    • Marko Mäkelä's avatar
      Merge 10.3 into 10.4 · 4b959bd8
      Marko Mäkelä authored
      4b959bd8
    • Marko Mäkelä's avatar
      Merge 10.2 into 10.3 · acc58fd8
      Marko Mäkelä authored
      acc58fd8
    • Marko Mäkelä's avatar
      Merge 10.1 into 10.2 · ca9276e3
      Marko Mäkelä authored
      ca9276e3
    • Marko Mäkelä's avatar
      MDEV-23190 InnoDB data file extension is not crash-safe · 57ec42bc
      Marko Mäkelä authored
      When InnoDB is extending a data file, it is updating the FSP_SIZE
      field in the first page of the data file.
      
      In commit 8451e090 (MDEV-11556)
      we removed a work-around for this bug and made recovery stricter,
      by making it track changes to FSP_SIZE via redo log records, and
      extend the data files before any changes are being applied to them.
      
      It turns out that the function fsp_fill_free_list() is not crash-safe
      with respect to this when it is initializing the change buffer bitmap
      page (page 1, or generally, N*innodb_page_size+1). It uses a separate
      mini-transaction that is committed (and will be written to the redo
      log file) before the mini-transaction that actually extended the data
      file. Hence, recovery can observe a reference to a page that is
      beyond the current end of the data file.
      
      fsp_fill_free_list(): Initialize the change buffer bitmap page in
      the same mini-transaction.
      
      The rest of the changes are fixing a bug that the use of the separate
      mini-transaction was attempting to work around. Namely, we must ensure
      that no other thread will access the change buffer bitmap page before
      our mini-transaction has been committed and all page latches have been
      released.
      
      That is, for read-ahead as well as neighbour flushing, we must avoid
      accessing pages that might not yet be durably part of the tablespace.
      
      fil_space_t::committed_size: The size of the tablespace
      as persisted by mtr_commit().
      
      fil_space_t::max_page_number_for_io(): Limit the highest page
      number for I/O batches to committed_size.
      
      MTR_MEMO_SPACE_X_LOCK: Replaces MTR_MEMO_X_LOCK for fil_space_t::latch.
      
      mtr_x_space_lock(): Replaces mtr_x_lock() for fil_space_t::latch.
      
      mtr_memo_slot_release_func(): When releasing MTR_MEMO_SPACE_X_LOCK,
      copy space->size to space->committed_size. In this way, read-ahead
      or flushing will never be invoked on pages that do not yet exist
      according to FSP_SIZE.
      57ec42bc
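      The relationship between size and committed_size described above can be sketched as follows. The two member names and max_page_number_for_io() come from the commit message; `space_t`, `extend()`, `release_x_latch()` and the exact clamping rule are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical, simplified tablespace.
struct space_t {
  std::uint32_t size = 0;            // pages, may include uncommitted growth
  std::uint32_t committed_size = 0;  // pages as of the last committed mtr

  // Called while the exclusive space latch is held: grow in memory only.
  void extend(std::uint32_t pages) {
    assert(size + pages >= size);  // no wrap-around
    size += pages;
  }

  // Called when the exclusive space latch is released at mtr commit.
  void release_x_latch() { committed_size = size; }

  // Clamp the upper bound of an I/O batch to pages that are known to exist.
  std::uint32_t max_page_number_for_io(std::uint32_t n) const {
    return std::min(n, committed_size);
  }
};
```

      Read-ahead or neighbour flushing that asks for a bound beyond committed_size gets the committed value back, so it cannot touch pages of an extension that is still in progress.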
    • Marko Mäkelä's avatar
      98e2c17e
    • Marko Mäkelä's avatar
      14543afd
    • Marko Mäkelä's avatar
      MDEV-22771 Instant extension of CHAR column is wrongly allowed · 0a7faed7
      Marko Mäkelä authored
      commit 854c219a (MDEV-17301)
      broke a constraint: Fixed-length columns cannot be extended in InnoDB
      without rebuilding the table.
      
      ha_innobase::can_convert_string(): Correct the condition. We must
      not allow any instantaneous change to the length of CHAR columns
      measured in characters. For any format other than ROW_FORMAT=REDUNDANT,
      we can allow the length in bytes to be extended if mbminlen<mbmaxlen held
      before the change of the character set.
      0a7faed7
  4. 18 Jul, 2020 1 commit
  5. 17 Jul, 2020 1 commit
  6. 16 Jul, 2020 9 commits
    • Julius Goryavsky's avatar
      MDEV-20401: revert unnecessary change · a1e52e7f
      Julius Goryavsky authored
      a1e52e7f
    • Julius Goryavsky's avatar
      MDEV-20401: revert unnecessary change · 1ba8df4c
      Julius Goryavsky authored
      1ba8df4c
    • Julius Goryavsky's avatar
    • Julius Goryavsky's avatar
      MDEV-20401: Server incorrectly auto-sets lower_case_file_system value · b3cae9db
      Julius Goryavsky authored
      The server auto-sets the lower_case_file_system value based on the default
      datadir's behavior instead of using the directory specified
      by the user through the configuration file or command line options.
      
      This patch fixes this problem.
      b3cae9db
    • Julius Goryavsky's avatar
      MDEV-20401: Server incorrectly auto-sets lower_case_file_system value · 4412a461
      Julius Goryavsky authored
      The server auto-sets the lower_case_file_system value based on the default
      datadir's behavior instead of using the directory specified
      by the user through the configuration file or command line options.
      
      This patch fixes this problem.
      4412a461
    • Marko Mäkelä's avatar
      Merge 10.4 into 10.5 · 054f1036
      Marko Mäkelä authored
      054f1036
    • Marko Mäkelä's avatar
      Merge 10.3 into 10.4 · 3280edda
      Marko Mäkelä authored
      3280edda
    • Marko Mäkelä's avatar
      Merge 10.2 into 10.3 · 73aa31fb
      Marko Mäkelä authored
      73aa31fb
    • Marko Mäkelä's avatar
      MDEV-21347 innodb_log_optimize_ddl=OFF is not crash safe · 147d4b1e
      Marko Mäkelä authored
      In commit 0f90728b (MDEV-16809)
      we introduced the configuration option innodb_log_optimize_ddl
      for controlling whether native index creation or table-rebuild
      in InnoDB should avoid writing full redo log.
      
      Fungo Wang reported that this option is causing occasional failures.
      The reason is that pages may be written to data files in an
      inconsistent state. Applying log records to such inconsistent pages
      may fail.
      
      The solution is to always invoke PageBulk::finish() before page latches
      may be released, to ensure that the page contents are in a consistent
      state.
      
      Something similar was implemented in MySQL 8.0.13:
      mysql/mysql-server@d1254b947354e0f5b7223b09c521bd85f22e1e31
      
      buf_block_t::skip_flush_check: Remove. Suppressing consistency checks
      is a bad idea.
      
      PageBulk::needs_finish(): New predicate: Determine whether
      PageBulk::finish() must fix up the page.
      
      PageBulk::init(): Clear PAGE_DIRECTION to ensure that needs_finish()
      will hold. We change the field from PAGE_NO_DIRECTION to 0
      and back without writing redo log. This trick avoids the need
      to introduce any new data member to PageBulk.
      
      PageBulk::insert(): Replace some high-level accessors to bypass
      debug assertions related to PAGE_HEAP_TOP that we will be violating
      until finish() has been executed.
      
      PageBulk::finish(): Tolerate m_rec_no==0. We must invoke this also
      on an empty page, to ensure that PAGE_HEAP_TOP is initialized.
      
      PageBulk::commit(): Always invoke finish().
      
      PageBulk::release(), BtrBulk::pageSplit(), BtrBulk::storeExt(),
      BtrBulk::finish(): Invoke PageBulk::finish().
      147d4b1e
  7. 15 Jul, 2020 7 commits
    • Marko Mäkelä's avatar
      Make page validation stricter · fee11c77
      Marko Mäkelä authored
      page_simple_validate_old(), page_simple_validate_new():
      Require PAGE_N_DIR_SLOTS to be at least 2.
      fee11c77
    • Marko Mäkelä's avatar
      MDEV-23183 Infinite loop on page_validate() on corrupted page · 38b4c078
      Marko Mäkelä authored
      MDEV-22721 (commit eba2d10a)
      inadvertently introduced an infinite loop.
      
      page_validate(): Remove the infinite loop.
      38b4c078
    • Daniel Black's avatar
      MDEV-23175: my_timer_milliseconds ftime deprecated - clock_gettime replacement · 20512a68
      Daniel Black authored
      Linux glibc has deprecated ftime, resulting in a compile error on Fedora-32.
      
      Per the manual, clock_gettime is the suggested replacement. Because my_timer_milliseconds
      is a relative time used largely by the performance schema, CLOCK_MONOTONIC_COARSE
      is used. This has been available since Linux-2.6.32.
      
      The low overhead is shown in the unit test:
      
          $ unittest/mysys/my_rdtsc-t
          1..11
          # ----- Routine ---------------
          # myt.cycles.routine          :             5
          # myt.nanoseconds.routine     :            11
          # myt.microseconds.routine    :            13
          # myt.milliseconds.routine    :            18
          # myt.ticks.routine           :            17
          # ----- Frequency -------------
          # myt.cycles.frequency        :    3596597014
          # myt.nanoseconds.frequency   :    1000000000
          # myt.microseconds.frequency  :       1000000
          # myt.milliseconds.frequency  :          1039
          # myt.ticks.frequency         :           103
          # ----- Resolution ------------
          # myt.cycles.resolution       :             1
          # myt.nanoseconds.resolution  :             1
          # myt.microseconds.resolution :             1
          # myt.milliseconds.resolution :             1
          # myt.ticks.resolution        :             1
          # ----- Overhead --------------
          # myt.cycles.overhead         :           118
          # myt.nanoseconds.overhead    :           234
          # myt.microseconds.overhead   :           222
          # myt.milliseconds.overhead   :            30
          # myt.ticks.overhead          :          4946
          ok 1 - my_timer_init() did not crash
          ok 2 - The cycle timer is strictly increasing
          ok 3 - The cycle timer is implemented
          ok 4 - The nanosecond timer is increasing
          ok 5 - The nanosecond timer is implemented
          ok 6 - The microsecond timer is increasing
          ok 7 - The microsecond timer is implemented
          ok 8 - The millisecond timer is increasing
          ok 9 - The millisecond timer is implemented
          ok 10 - The tick timer is increasing
          ok 11 - The tick timer is implemented
      20512a68
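      A relative millisecond counter built on clock_gettime() with CLOCK_MONOTONIC_COARSE might look like the sketch below. It is not the my_timer_milliseconds() source: `coarse_milliseconds()` is a hypothetical helper, and returning 0 when the clock is unavailable is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <time.h>

// Hypothetical helper: returns a relative millisecond counter, or 0 if the
// coarse monotonic clock is not available on this platform.
inline std::uint64_t coarse_milliseconds() {
#ifdef CLOCK_MONOTONIC_COARSE
  struct timespec ts;
  if (clock_gettime(CLOCK_MONOTONIC_COARSE, &ts) != 0) return 0;
  assert(ts.tv_nsec >= 0 && ts.tv_nsec < 1000000000L);
  return static_cast<std::uint64_t>(ts.tv_sec) * 1000 +
         static_cast<std::uint64_t>(ts.tv_nsec) / 1000000;
#else
  return 0;  // assumption: callers treat 0 as "not implemented"
#endif
}
```

      The coarse clock trades resolution for a cheaper call, which suits a timer that is only compared against its own earlier readings.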
    • Marko Mäkelä's avatar
      Merge 10.4 into 10.5 · e67daa56
      Marko Mäkelä authored
      e67daa56
    • Vladislav Vaintroub's avatar
      Fix compile warning · 9c8420fe
      Vladislav Vaintroub authored
      9c8420fe
    • Marko Mäkelä's avatar
      Revert MDEV-20453 (string_view) · ced3ec4c
      Marko Mäkelä authored
      In fsp_path_to_space_name(), we would access a byte right before
      the start of the string, tripping AddressSanitizer.
      
      This reverts commit d87006a1
      and commit a7634281.
      ced3ec4c
    • Marko Mäkelä's avatar
      Merge 10.3 into 10.4 · 9936cfd5
      Marko Mäkelä authored
      9936cfd5