1. 26 Oct, 2022 4 commits
    • MDEV-29886 Assertion !index->table->is_temporary() failed in CHECK TABLE · cf96db4f
      Marko Mäkelä authored
      ha_innobase::check(): Do not enable READ UNCOMMITTED isolation level
      for temporary tables, because it would report index count mismatch
      for secondary indexes.
      
      row_check_index(): Ignore EXTENDED for temporary tables, because
      the tables are private to the current connection and there will be
      no purge of committed transaction history.
    • MDEV-29883 Deadlock between InnoDB statistics update and BLOB insert · 8b6a308e
      Marko Mäkelä authored
      The test innodb.innodb-wl5522-debug would occasionally hang
      (especially when run with ./mtr --rr) due to a deadlock between
      btr_store_big_rec_extern_fields() and dict_stats_analyze_index().
      The two threads would acquire the clustered index root page latch and
      the tablespace latch in the opposite order. The deadlock was possible
      because dict_stats_analyze_index() was holding the index latch in
      shared mode and an index root page latch, while waiting for the
      tablespace latch. If a stronger dict_index_t::lock had been held
      by dict_stats_analyze_index(), any operations that free or allocate
      index pages would have been blocked.
      
      In each caller of fseg_n_reserved_pages() except ibuf_init_at_db_start()
      which is a special case for ibuf.index at database startup, we must hold
      an index latch that prevents concurrent allocation or freeing of index
      pages.
      
      Any operation that allocates or frees pages that belong to an index tree
      must first acquire an index latch in Update or Exclusive mode, and while
      holding that, acquire an index root page latch in Update or Exclusive
      mode.
      
      dict_index_t::clear(): Also acquire an index latch. Otherwise,
      the test innodb.insert_into_empty could hang.
      
      btr_get_size_and_reserved(): Assert that a strong enough index latch
      is being held. Only acquire a shared fil_space_t::latch; we are only
      reading, not modifying any data.
      
      dict_stats_update_transient_for_index(),
      dict_stats_analyze_index(): Acquire a strong enough index latch. Only
      acquire a shared fil_space_t::latch.
      
      These operations had followed the same order of acquiring latches in
      every InnoDB version since the very beginning
      (commit c533308a).
      The calls for acquiring tablespace latch had previously been moved in
      commit 87839258 and
      commit 1e9c922f.
      
      The hang was introduced in
      commit 2e814d47 which imported
      mysql/mysql-server@ac74632293bea967b352d1b472abedeeaa921b98
      which failed to strengthen the locking requirements of the function
      btr_get_size().
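
      The latching rule described above can be illustrated with a minimal
      sketch. The Index struct and both functions are hypothetical stand-ins
      built on std::shared_mutex, not InnoDB code, and std::shared_mutex has
      no Update mode, so exclusive mode is used instead. The only point is
      that every path takes the index latch first, then the root page latch,
      and the tablespace latch last.

```cpp
#include <cassert>
#include <mutex>
#include <shared_mutex>
#include <thread>

// Hypothetical stand-ins for dict_index_t::lock, the index root page
// latch and fil_space_t::latch. None of this is InnoDB code.
struct Index {
  std::shared_mutex index_latch;
  std::shared_mutex root_page_latch;
  std::shared_mutex space_latch;
  int pages = 1;  // number of pages in the index tree
};

// Allocation path: index latch, then root page latch, then tablespace.
void allocate_page(Index &ix) {
  std::unique_lock<std::shared_mutex> index(ix.index_latch);
  std::unique_lock<std::shared_mutex> root(ix.root_page_latch);
  std::unique_lock<std::shared_mutex> space(ix.space_latch);
  ++ix.pages;
}

// Statistics path: the same order. The index latch is strong enough to
// block page allocation; the tablespace latch is only shared because
// this path reads and never modifies.
int count_pages(Index &ix) {
  std::unique_lock<std::shared_mutex> index(ix.index_latch);
  std::shared_lock<std::shared_mutex> root(ix.root_page_latch);
  std::shared_lock<std::shared_mutex> space(ix.space_latch);
  return ix.pages;
}

// Runs both paths concurrently. Because the acquisition order is the
// same in both, neither thread can wait for a latch the other holds
// while holding one that the other needs.
int run_concurrently(int rounds) {
  Index ix;
  std::thread writer([&] { for (int i = 0; i < rounds; ++i) allocate_page(ix); });
  std::thread reader([&] { for (int i = 0; i < rounds; ++i) count_pages(ix); });
  writer.join();
  reader.join();
  return ix.pages;
}
```

      Calling run_concurrently(1000) should finish with 1001 pages counted.
      If count_pages() took the root page latch before the index latch, the
      two threads could block each other.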
    • MDEV-29869 mtr failure: innodb.deadlock_wait_thr_race · 78a04a4c
      Vlad Lesin authored
      1. The merge aeccbbd9 overwrote lock0lock.cc, so the changes of
      MDEV-29622 and MDEV-29635 were partially lost. This commit restores
      them.
      
      2. innodb.deadlock_wait_thr_race test:
      
      The following hang was found during testing.
      
      There is a deadlock_report_before_lock_releasing sync point in
      Deadlock::report(), which waits for the sel_cont signal while holding
      the lock_sys_t lock. The signal must be issued after the rollback of
      "UPDATE t SET b = 100", and that rollback is executing an undo record,
      which is blocked on a dict_sys latch request. dict_sys is locked by the
      statistics update thread (dict_stats_save()), and during that update
      the lock_sys lock is requested, which cannot be acquired because
      Deadlock::report() holds it. We have to disable the statistics update
      to make the test stable.
      
      But even if the statistics update is disabled, and a transaction with a
      consistent snapshot is started at the very beginning of the test to
      prevent purging, the purge can still be invoked for system tables. It
      tries to open a system table by id, which causes a dict_sys.freeze()
      call and dict_sys latching. This, in combination with
      lock_sys::xx_lock(), causes the same deadlock as described above. We
      need to disable purging globally for the test as well.
      
      All of the above also applies to the innodb.deadlock_wait_lock_race test.
    • MDEV-29662 Replace same values in 'IN' list with an equality · 5027cb2b
      Oleg Smirnov authored
      If all elements in the list of an 'IN' or 'NOT IN' clause are equal
      and there are no NULLs, then the clause
      -  "a IN (e1,..,en)" can be converted to "a = e1"
      -  "a NOT IN (e1,..,en)" can be converted to "a <> e1".
      This means an object of Item_func_in can be replaced with an object
      of Item_func_eq for an IN (e1,..,en) clause and Item_func_ne for
      NOT IN (e1,..,en). Such a replacement allows the optimizer to choose
      a better execution plan.
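
      The rewrite condition can be sketched as follows. This is a toy model
      and not the optimizer code: Value and rewrite_in_list() are
      hypothetical names, list elements are plain integers, and std::nullopt
      stands for SQL NULL.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// std::nullopt models SQL NULL. All names here are hypothetical.
using Value = std::optional<int>;

// Returns the rewritten predicate as text, or an empty string when the
// rewrite does not apply (empty list, a NULL element, or differing
// elements).
std::string rewrite_in_list(const std::string &col,
                            const std::vector<Value> &list,
                            bool negated) {
  if (list.empty() || !list.front().has_value())
    return "";
  for (const Value &v : list)
    if (!v.has_value() || *v != *list.front())
      return "";
  return col + (negated ? " <> " : " = ") + std::to_string(*list.front());
}
```

      For example, rewrite_in_list("a", {5, 5, 5}, false) yields "a = 5",
      while a list containing NULL or two different values is left alone.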
  2. 25 Oct, 2022 9 commits
  3. 24 Oct, 2022 9 commits
  4. 22 Oct, 2022 9 commits
    • MDEV-29851 Cached role privileges are not invalidated when needed · 68fb05c3
      Sergei Golubchik authored
      GRANT ROLE can update db-level privileges -> must invalidate acl_cache
    • cleanup: rename test file · 7a2f9956
      Sergei Golubchik authored
    • remove two acl_cache->clear() · 741c14cb
      Sergei Golubchik authored
      * to "clear hostname cache" one needs to use hostname_cache->clear()
      * no need to clear acl_cache for SET DEFAULT ROLE
    • MDEV-29481 mariadb-upgrade prints confusing statement · 2a57396e
      Alexander Barkov authored
      This is a new version of the patch instead of the reverted:
      
        MDEV-28727 ALTER TABLE ALGORITHM=NOCOPY does not work after upgrade
      
      Ignore the difference in key packing flags HA_BINARY_PACK_KEY and HA_PACK_KEY
      during ALTER to allow ALGORITHM=INSTANT and ALGORITHM=NOCOPY in more cases.
      
      If for some reason (e.g. due to a bug fix such as MDEV-20704) these
      cumulative (over all segments) flags in KEY::flags are different for
      the old and new table inside compare_keys_but_name(), the difference
      in HA_BINARY_PACK_KEY and HA_PACK_KEY in KEY::flags is not really important:
      
      MyISAM and Aria can handle such cases well: per-segment flags are stored in
      MYI and MAI files anyway, and they are read at ha_myisam::open() and
      ha_maria::open() time. So indexes get opened with the correct per-segment
      flags that were calculated at table CREATE time, no matter
      what the old (CREATE time) and new (ALTER time) per-index compression
      flags are, and no matter whether they are equal or not.
      
      All other engines ignore key compression flags, so this change
      is safe for other engines as well.
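
      Ignoring the two packing flags in a comparison amounts to masking them
      out before testing for equality. The sketch below assumes illustrative
      bit values and hypothetical names (FLAG_*, key_flags_equivalent()); the
      real HA_* constants and compare_keys_but_name() are not reproduced
      here.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative flag bits only; the real HA_* values live in the server
// headers and may differ.
constexpr uint32_t FLAG_UNIQUE = 1u << 0;
constexpr uint32_t FLAG_PACK_KEY = 1u << 1;
constexpr uint32_t FLAG_BINARY_PACK_KEY = 1u << 5;

// Flags whose difference does not matter when comparing old and new keys.
constexpr uint32_t IGNORED_FLAGS = FLAG_PACK_KEY | FLAG_BINARY_PACK_KEY;

// True if the two flag sets differ at most in the ignored packing bits.
bool key_flags_equivalent(uint32_t old_flags, uint32_t new_flags) {
  return ((old_flags ^ new_flags) & ~IGNORED_FLAGS) == 0;
}
```

      With this mask, a key that changed only from one packing flag to the
      other still compares as equivalent, while a change in any other flag
      is still detected.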
    • CONNECT: compile with libxml2 2.10.x · 16d4431a
      Sergei Golubchik authored
      storage/connect/libdoc.cpp:603:17: error: 'void xmlXPathInit()' is deprecated [-Werror=deprecated-declarations]
    • disable LTO in debian builds · 0609b345
      Sergei Golubchik authored
    • MDEV-15795 Stack exceeded if pthread_attr_setstacksize(&thr_attr,8196) succeeds · 3e377fd3
      Sergei Golubchik authored
      on Linux this pthread_attr_setstacksize() fails with EINVAL
      "The stack size is less than PTHREAD_STACK_MIN (16384) bytes".
      
      But on FreeBSD it succeeds and causes a crash later, as 8196 is too little.
      
      Let's keep the stack at its default size in the timer thread.
    • fix for x86 and other 32-bit little endian arch · 68391ace
      Sergei Golubchik authored
      (and for 64-bit big endian)
    • Use OPENSSL_free instead of free to avoid instance crash · 45755c4e
      Haidong Ji authored
      OpenSSL handles memory management using **OPENSSL_xxx** API[^1]. For
      allocation, there is `OPENSSL_malloc`. To free it, `OPENSSL_free` should
      be called.
      
      We've been lucky that OpenSSL's (and wolfSSL's) implementation allowed the
      usage of `free` for memory cleanup. However, other OpenSSL forks, such
      as AWS-LC[^2], are not this forgiving. Using `free` there will cause a
      server crash.
      
      Test case `openssl_1` provides good coverage for this issue. If a user
      is created using:
      `grant select on test.* to user1@localhost require SUBJECT "...";`
      user1 will crash the instance during connection under AWS-LC.
      
      There have been numerous OpenSSL forks[^3]. Due to FIPS[^4] and other
      related regulatory requirements, MariaDB will be built using them. This
      fix will increase MariaDB's adaptability by using a more compliant and
      generally accepted API.
      
      All new code of the whole pull request, including one or several files
      that are either new files or modified ones, are contributed under the
      BSD-new license. I am contributing on behalf of my employer Amazon Web
      Services, Inc.
      
      [^1]: https://www.openssl.org/docs/man1.1.1/man3/OPENSSL_malloc.html
      [^2]: https://github.com/awslabs/aws-lc
      [^3]: https://en.wikipedia.org/wiki/OpenSSL#Forks
      [^4]: https://en.wikipedia.org/wiki/FIPS_140-2
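
      Why a mismatched `free` can crash is easiest to see with a toy
      allocator. The sketch below does not use OpenSSL at all: `lib::alloc`
      and `lib::release` are hypothetical stand-ins for a library's
      allocation pair, and the header placed in front of each block is an
      assumption made for illustration. Binding the matching release
      function to the pointer type makes the wrong call hard to write.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <memory>

// Hypothetical library allocator. It reserves a header in front of each
// block, so the pointer it hands out is NOT what malloc() returned and
// passing it to plain free() would be invalid.
namespace lib {
int live_blocks = 0;
constexpr std::size_t HEADER = alignof(std::max_align_t);

void *alloc(std::size_t n) {
  char *raw = static_cast<char *>(std::malloc(HEADER + n));
  if (!raw) return nullptr;
  ++live_blocks;
  return raw + HEADER;
}

void release(void *p) {
  if (!p) return;
  --live_blocks;
  std::free(static_cast<char *>(p) - HEADER);
}
}  // namespace lib

// Deleter that always calls the library's own release function.
struct LibDeleter {
  void operator()(char *p) const { lib::release(p); }
};
using LibString = std::unique_ptr<char, LibDeleter>;

// Copies a C string into library-owned memory.
LibString lib_strdup(const char *s) {
  std::size_t n = std::strlen(s) + 1;
  char *p = static_cast<char *>(lib::alloc(n));
  if (p) std::memcpy(p, s, n);
  return LibString(p);
}
```

      A `LibString` goes out of scope through `lib::release`, which is the
      same discipline as pairing `OPENSSL_malloc` with `OPENSSL_free`.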
  5. 21 Oct, 2022 9 commits
    • MDEV-29678 Valgrind/MSAN uninitialised value errors upon PS with ALTER under ONLY_FULL_GROUP_BY · e4621718
      Daniel Black authored
      st_select_lex::init_query is called in the execution of EXECUTE
      IMMEDIATE 'alter table ...', so reset the initialization at the
      same point we set join= 0.
    • MDEV-23160: SIGSEGV in Explain_node::print_explain_for_children on UNION SELECT · 6bc2e933
      Sergei Petrunia authored
      and also MDEV-25564, MDEV-18157.
      
      Attempt to produce EXPLAIN output caused a crash in
      Explain_node::print_explain_for_children. The cause of this was that an
      Explain_node (actually a derived) had a link to child select#N, but
      there was no query plan present for select#N.
      
      The query plan wasn't present because the subquery was eliminated.
      - Either it was a degenerate subquery like "(SELECT 1)" in MDEV-25564.
      - Or it was a subquery in a UNION subquery's ORDER BY clause:
         col IN (SELECT ... UNION
                 SELECT ... ORDER BY (SELECT FROM t1))
      
      In such cases, legacy code structure in subquery/union processing code(*)
      makes it hard to detect that the subquery was eliminated, so we end up
      with EXPLAIN data structures (Explain_node::children) having dangling
      links to child subqueries.
      Do make the checks and don't follow the dangling links.
      
      (In an ideal world, we should not have these dangling links. But fixing
      the code (*) would carry a high risk for the stable versions.)
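
      The defensive check can be sketched with a simplified model. Node,
      Plans and print_explain() are hypothetical and far simpler than the
      real Explain_node classes: a node lists child select numbers, and a
      plan may or may not exist for each number.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified model of EXPLAIN data structures.
struct Node {
  std::string name;
  std::vector<int> children;  // select#N identifiers of child selects
};

using Plans = std::map<int, Node>;  // query plans keyed by select number

// Collects the names of all reachable nodes. A child id without a plan
// (for example an eliminated subquery) is skipped instead of followed.
void print_explain(const Plans &plans, const Node &node,
                   std::vector<std::string> &out) {
  out.push_back(node.name);
  for (int id : node.children) {
    auto it = plans.find(id);
    if (it == plans.end())
      continue;  // dangling link: no query plan for select#id
    print_explain(plans, it->second, out);
  }
}
```

      A root node that refers to select#2 and select#3, where only select#2
      has a plan, prints two entries and never touches the missing one.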
    • MDEV-29687:ODBC tables do not quote identifier names correctly (#2295) · 0c06320a
      Anel authored
      Reviewer: andrew@mariadb.org
    • Deb: Use archive.mariadb.org for upgrade testing in Salsa-CI (#2294) · dca4fc24
      Otto Kekäläinen authored
      The official deb.mariadb.org mirrors are intended for distribution of the
      current MariaDB releases. When a version goes end-of-life, they are
      removed from those mirrors.
      
      The upgrade tests should however work even after EOL. While we do want
      users to stop using EOL versions, we still expect the newer versions to
      support upgrades from old versions to the current versions. Therefore we
      should continue testing upgrades from EOL versions, and for that to work,
      switch the CI to use the archive.mariadb.org repositories instead.
      
      MERGE NOTE: This commit was made on the oldest branch with the salsa-ci.yml
      file. When merging 10.5->10.6->...->10.12 please include this commit in
      the merge and ensure all files end up with the change:
      
          deb.mariadb.org/10.([0-9]+)/ -> archive.mariadb.org/mariadb-10.$1/repo/
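
      The substitution in the merge note can be checked with a small
      sketch. to_archive_url() is a hypothetical helper and is not part of
      the CI scripts; unlike the shorthand above, it escapes the dots so
      that they match only literal dots.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Rewrites a deb.mariadb.org 10.x repository path to the equivalent
// archive.mariadb.org path. Text that does not match is left unchanged.
std::string to_archive_url(const std::string &line) {
  static const std::regex pattern(R"(deb\.mariadb\.org/10\.([0-9]+)/)");
  return std::regex_replace(line, pattern,
                            "archive.mariadb.org/mariadb-10.$1/repo/");
}
```

      For example, "https://deb.mariadb.org/10.5/ubuntu" becomes
      "https://archive.mariadb.org/mariadb-10.5/repo/ubuntu".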
    • MDEV-29622 Wrong assertions in lock_cancel_waiting_and_release() for deadlock resolving caller · 9c04d66d
      Vlad Lesin authored
      Suppose we have two transactions, trx 1 and trx 2.
      
      trx 2 performs deadlock resolution from lock_wait(). It sets
      victim->lock.was_chosen_as_deadlock_victim=true for trx 1, but has not
      yet invoked lock_cancel_waiting_and_release().
      
      trx 1 checks the flag in lock_trx_handle_wait(), and starts rollback
      from row_mysql_handle_errors(). It can change trx->lock.wait_thr and
      trx->state as it holds trx_t::mutex, but trx 2 has not yet requested it,
      as lock_cancel_waiting_and_release() has not yet been called.
      
      After that, trx 1 tries to release locks in trx_t::rollback_low(),
      invoking trx_t::rollback_finish(). lock_release() is blocked trying to
      acquire lock_sys.rd_lock(SRW_LOCK_CALL) in lock_release_try(), because
      lock_sys is held by trx 2: deadlock resolution works under
      lock_sys.wr_lock(SRW_LOCK_CALL); see Deadlock::report() for details.
      
      trx 2 executes lock_cancel_waiting_and_release() for the deadlock victim,
      i.e. for trx 1. lock_cancel_waiting_and_release() contains some
      trx->lock.wait_thr and trx->state assertions, which will fail, because
      trx 1 has changed them during rollback execution.
      
      So, according to the above scenario, it's legal to have
      trx->lock.wait_thr==0 and trx->state!=TRX_STATE_ACTIVE in
      lock_cancel_waiting_and_release() if it was invoked from
      Deadlock::report(). The fix is just to change the assertion
      conditions.
      
      There is also a lock_wait() cleanup around trx->error_state.
      
      If trx->error_state can be changed by a thread other than its owner, it
      must be protected with lock_sys.wait_mutex, as lock_wait() uses
      trx->lock.cond along with that mutex.
      
      Also, if trx->error_state was changed before the lock_sys.wait_mutex
      acquisition, then it could be reset by the code that follows, which is
      wrong. We also need to check trx->error_state before entering the
      waiting loop; otherwise trx->error_state may have been set before the
      lock_sys.wait_mutex acquisition while the thread still waits on
      trx->lock.cond.
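
      The rule about checking the error state before waiting is the usual
      condition-variable pattern. The sketch below is not lock_wait()
      itself: Waiter, wait_for_lock() and report_error() are hypothetical
      stand-ins for trx->error_state, lock_sys.wait_mutex and
      trx->lock.cond.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical stand-ins for the waiting transaction's state.
struct Waiter {
  std::mutex wait_mutex;
  std::condition_variable cond;
  int error_state = 0;   // 0 means "no error"; written only under wait_mutex
  bool granted = false;  // set when the lock is granted
};

// Returns the error state, or 0 once the lock has been granted.
int wait_for_lock(Waiter &w) {
  std::unique_lock<std::mutex> guard(w.wait_mutex);
  // Both conditions are tested before the first wait and after every
  // wakeup, so an error reported before the mutex was acquired is
  // neither missed nor overwritten.
  while (w.error_state == 0 && !w.granted)
    w.cond.wait(guard);
  return w.error_state;
}

// Called by another thread, for example the deadlock resolver.
void report_error(Waiter &w, int err) {
  {
    std::lock_guard<std::mutex> guard(w.wait_mutex);
    w.error_state = err;
  }
  w.cond.notify_one();
}
```

      If report_error() runs before wait_for_lock(), the waiter returns at
      once with the error instead of sleeping on a signal that has already
      been sent.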
    • MDEV-29635 race on trx->lock.wait_lock in deadlock resolution · acebe357
      Vlad Lesin authored
      Returning DB_SUCCESS unconditionally if !trx->lock.wait_lock in
      lock_trx_handle_wait() is wrong. Even if
      trx->lock.was_chosen_as_deadlock_victim was not set before the first check
      in lock_trx_handle_wait(), it can be set after
      the check, and trx->lock.wait_lock can be reset by another thread from
      lock_reset_lock_and_trx_wait() if the transaction was chosen as deadlock
      victim. In this case lock_trx_handle_wait() will return DB_SUCCESS even
      though the transaction was marked as deadlock victim, and continue
      execution instead of rolling back.
      
      The fix is to check trx->lock.was_chosen_as_deadlock_victim once more if
      trx->lock.wait_lock is reset, as trx->lock.wait_lock can be reset only
      after trx->lock.was_chosen_as_deadlock_victim was set if the transaction
      was chosen as deadlock victim.
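
      The double check can be sketched with two atomics. This is a
      simplified model rather than lock_trx_handle_wait(): TrxLock,
      choose_as_victim(), handle_wait() and the three result values are
      hypothetical, and the real function does more work when a wait lock is
      still present.

```cpp
#include <atomic>
#include <cassert>

enum Result { SUCCESS, DEADLOCK, LOCK_WAIT };

// Hypothetical stand-ins for trx->lock.was_chosen_as_deadlock_victim
// and trx->lock.wait_lock.
struct TrxLock {
  std::atomic<bool> chosen_as_victim{false};
  std::atomic<const void *> wait_lock{nullptr};
};

// Resolver side: the victim flag is set first, the wait lock is reset
// only afterwards.
void choose_as_victim(TrxLock &l) {
  l.chosen_as_victim.store(true, std::memory_order_release);
  l.wait_lock.store(nullptr, std::memory_order_release);
}

// Waiter side.
Result handle_wait(TrxLock &l) {
  if (l.chosen_as_victim.load(std::memory_order_acquire))
    return DEADLOCK;
  if (l.wait_lock.load(std::memory_order_acquire) == nullptr) {
    // The resolver may have run between the first check and this one.
    // Since it sets the flag before resetting wait_lock, checking the
    // flag once more here cannot miss it.
    return l.chosen_as_victim.load(std::memory_order_acquire) ? DEADLOCK
                                                              : SUCCESS;
  }
  return LOCK_WAIT;
}
```

      Without the second load, a resolver running between the two checks
      would make handle_wait() report SUCCESS for a transaction that must
      roll back.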
    • MDEV-29615 mtr to use mariadb names · 7afc6ee8
      Daniel Black authored
    • MDEV-24402: InnoDB CHECK TABLE ... EXTENDED · ab019010
      Marko Mäkelä authored
      Until now, the attribute EXTENDED of CHECK TABLE was ignored by InnoDB,
      and InnoDB only counted the records in each index according
      to the current read view. Unless the attribute QUICK was specified, the
      function btr_validate_index() would be invoked to validate the B-tree
      structure (the sibling and child links between index pages).
      
      The EXTENDED check will not only count all index records according to the
      current read view, but also ensure that any delete-marked records in the
      clustered index are waiting for the purge of history, and that all
      secondary index records point to a version of the clustered index record
      that is waiting for the purge of history. In other words, no index may
      contain orphan records. Normal MVCC reads and the non-EXTENDED version
      of CHECK TABLE would ignore these orphans.
      
      Unpurged records merely result in warnings (at most one per index),
      not errors, and no indexes will be flagged as corrupted due to such
      garbage. It will remain possible to SELECT data from such indexes or
      tables (which will skip such records) or to rebuild the table to
      reclaim some space.
      
      We introduce purge_sys.end_view that will be (almost) a copy of
      purge_sys.view at the end of a batch of purging committed transaction
      history. It is not an exact copy, because if the size of a purge batch
      is limited by innodb_purge_batch_size, some records that
      purge_sys.view would allow to be purged will be left over for
      subsequent batches.
      
      The purge_sys.view is relevant in the purge of committed transaction
      history, to determine if records are safe to remove. The new
      purge_sys.end_view is relevant in MVCC operations and in
      CHECK TABLE ... EXTENDED. It tells which undo log records are
      safe to access (have not been discarded at the end of a purge batch).
      
      purge_sys.clone_oldest_view<true>(): In trx_lists_init_at_db_start(),
      clone the oldest read view similar to purge_sys_t::clone_end_view()
      so that CHECK TABLE ... EXTENDED will not report bogus failures between
      InnoDB restart and the completed purge of committed transaction history.
      
      purge_sys_t::is_purgeable(): Replaces purge_sys_t::changes_visible()
      in the case that purge_sys.latch will not be held by the caller.
      Among other things, this guards access to BLOBs. It is not safe to
      dereference any BLOBs of a delete-marked purgeable record, because
      they may have already been freed.
      
      purge_sys_t::view_guard::view(): Return a reference to purge_sys.view
      that will be protected by purge_sys.latch, held by purge_sys_t::view_guard.
      
      purge_sys_t::end_view_guard::view(): Return a reference to
      purge_sys.end_view while it is protected by purge_sys.end_latch.
      Whenever a thread needs to retrieve an older version of a clustered
      index record, it will hold a page latch on the clustered index page
      and potentially also on a secondary index page that points to the
      clustered index page. If these pages contain purgeable records that
      would be accessed by a currently running purge batch, the progress of
      the purge batch would be blocked by the page latches. Hence, it is
      safe to make a copy of purge_sys.end_view while holding an index page
      latch, and consult the copy of the view to determine whether a record
      should already have been purged.
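
      The guard idea can be sketched as a small RAII class. This is an
      assumption-laden model, not the purge_sys code: PurgeState, ReadView,
      EndViewGuard and copy_end_view() are hypothetical, and a
      std::shared_mutex stands in for purge_sys.end_latch.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <shared_mutex>

// Hypothetical, minimal read view.
struct ReadView {
  uint64_t low_limit = 0;
};

class PurgeState {
  mutable std::shared_mutex end_latch_;
  ReadView end_view_;

 public:
  // Holds end_latch_ in shared mode for as long as the guard lives, so
  // the reference returned by view() stays protected.
  class EndViewGuard {
    const PurgeState &sys_;
    std::shared_lock<std::shared_mutex> lock_;

   public:
    explicit EndViewGuard(const PurgeState &sys)
        : sys_(sys), lock_(sys.end_latch_) {}
    const ReadView &view() const { return sys_.end_view_; }
  };

  // Called at the end of a purge batch; takes the latch exclusively.
  void set_end_view(uint64_t low_limit) {
    std::unique_lock<std::shared_mutex> lock(end_latch_);
    end_view_.low_limit = low_limit;
  }
};

// Copies the view under the guard; the copy can be consulted after the
// latch has been released.
ReadView copy_end_view(const PurgeState &sys) {
  PurgeState::EndViewGuard guard(sys);
  return guard.view();
}
```

      A caller would take the copy while holding its page latch and then
      use the copy for visibility checks without holding the purge latch.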
      
      btr_validate_index(): Remove a redundant check.
      
      row_check_index_match(): Check if a secondary index record and a
      version of a clustered index record match each other.
      
      row_check_index(): Replaces row_scan_index_for_mysql().
      Count the records in each index directly, duplicating the relevant
      logic from row_search_mvcc(). Initialize check_table_extended_view
      for CHECK ... EXTENDED while holding an index leaf page latch.
      If we encounter an orphan record, the copy of purge_sys.end_view that
      we make is safe for visibility checks, and trx_undo_get_undo_rec() will
      check for the safety to access each undo log record. Should that check
      fail, we should return DB_MISSING_HISTORY to report a corrupted index.
      The EXTENDED check tries to match each secondary index record with
      every available clustered index record version, by duplicating the logic
      of row_vers_build_for_consistent_read() and invoking
      trx_undo_prev_version_build() directly.
      
      Before invoking row_check_index_match() on delete-marked clustered index
      record versions, we will consult purge_sys.is_purgeable() in order to
      avoid accessing freed BLOBs.
      
      We will always check that the DB_TRX_ID or PAGE_MAX_TRX_ID does not
      exceed the global maximum. Orphan secondary index records will be
      flagged only if everything up to PAGE_MAX_TRX_ID has been purged.
      We warn also about clustered index records whose nonzero DB_TRX_ID
      should have been reset in purge or rollback.
      
      trx_set_rw_mode(): Move an assertion from ReadView::set_creator_trx_id().
      
      trx_undo_prev_version_build(): Remove two debug-only parameters,
      and return an error code instead of a Boolean.
      
      trx_undo_get_undo_rec(): Return a pointer to the undo log record,
      or nullptr if one cannot be retrieved. Instead of consulting the
      purge_sys.view, consult the purge_sys.end_view to determine which
      records can be accessed.
      
      trx_undo_get_rec_if_purgeable(): A variant of trx_undo_get_undo_rec()
      that will consult purge_sys.view instead of purge_sys.end_view.
      
      TRX_UNDO_CHECK_PURGEABILITY: A new parameter to
      trx_undo_prev_version_build(), passed by row_vers_old_has_index_entry()
      so that purge_sys.view instead of purge_sys.end_view will be consulted
      to determine whether a secondary index record may be safely purged.
      
      row_upd_changes_disowned_external(): Remove. This should be more
      expensive than briefly latching purge_sys in trx_undo_prev_version_build()
      (which may make use of transactional memory).
      
      row_sel_reset_old_vers_heap(): New function, split from
      row_sel_build_prev_vers_for_mysql().
      
      row_sel_build_prev_vers_for_mysql(): Reorder some parameters
      to simplify the call to row_sel_reset_old_vers_heap().
      
      row_search_for_mysql(): Replaced with direct calls to row_search_mvcc().
      
      sel_node_get_nth_plan(): Define inline in row0sel.h.
      
      open_step(): Define at the call site, in simplified form.
      
      sel_node_reset_cursor(): Merged with the only caller open_step().
      ---
      ReadViewBase::check_trx_id_sanity(): Remove.
      Let us handle "future" DB_TRX_ID in a more meaningful way:
      
      row_sel_clust_sees(): Return DB_SUCCESS if the record is visible,
      DB_SUCCESS_LOCKED_REC if it is invisible, and DB_CORRUPTION if
      the DB_TRX_ID is in the future.
      
      row_undo_mod_must_purge(), row_undo_mod_clust(): Silently ignore
      corrupted DB_TRX_ID. We are in ROLLBACK, and we should have noticed
      that corruption when we were about to modify the record in the first
      place (leading us to refuse the operation).
      
      row_vers_build_for_consistent_read(): Return DB_CORRUPTION if
      DB_TRX_ID is in the future.
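
      The three-way outcome described for row_sel_clust_sees() can be
      sketched as below. The View struct and its single low_limit field are
      a heavy simplification assumed for illustration, and the Sees enum is
      a hypothetical stand-in for DB_SUCCESS, DB_SUCCESS_LOCKED_REC and
      DB_CORRUPTION.

```cpp
#include <cassert>
#include <cstdint>

// VISIBLE ~ DB_SUCCESS, INVISIBLE ~ DB_SUCCESS_LOCKED_REC,
// CORRUPT ~ DB_CORRUPTION (hypothetical mapping for this sketch).
enum class Sees { VISIBLE, INVISIBLE, CORRUPT };

// Hypothetical, heavily simplified read view: changes by transactions
// with an id below low_limit are visible.
struct View {
  uint64_t low_limit;
};

// max_trx_id is the next transaction id to be assigned, so any record
// id at or above it is "in the future" and indicates corruption.
Sees clust_sees(const View &view, uint64_t rec_trx_id, uint64_t max_trx_id) {
  if (rec_trx_id >= max_trx_id)
    return Sees::CORRUPT;
  return rec_trx_id < view.low_limit ? Sees::VISIBLE : Sees::INVISIBLE;
}
```

      Returning a distinct value for the "future" case lets the caller
      report corruption instead of treating the record as merely invisible.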
      
      Tested by: Matthias Leich
      Reviewed by: Vladislav Lesin
    • MDEV-29778 Having Unique index interference with MATCH from a FULLTEXT · e1414fc7
      Thirunarayanan Balathandayuthapani authored
      InnoDB fails to fetch FTS_DOC_ID if the select query uses a secondary
      index. So always do an extra lookup on the clustered index in the case
      of an FTS query.