1. 24 Jan, 2021 5 commits
    • Sergei Golubchik's avatar
      MDEV-23328 Server hang due to Galera lock conflict resolution · 29bbcac0
      Sergei Golubchik authored
      Mutex order violation here.
      When a wsrep BF thread kills a conflicting trx, the stack is
      
        wsrep_thd_LOCK()
        wsrep_kill_victim()
        lock_rec_other_has_conflicting()
        lock_clust_rec_read_check_and_lock()
        row_search_mvcc()
        ha_innobase::index_read()
        ha_innobase::rnd_pos()
        handler::ha_rnd_pos()
        handler::rnd_pos_by_record()
        handler::ha_rnd_pos_by_record()
        Rows_log_event::find_row()
        Update_rows_log_event::do_exec_row()
        Rows_log_event::do_apply_event()
        Log_event::apply_event()
        wsrep_apply_events()
      
      and mutexes are taken in the order
      
        lock_sys->mutex -> victim_trx->mutex -> victim_thread->LOCK_thd_data
      
      When a normal KILL statement is executed, the stack is
      
        innobase_kill_query()
        kill_handlerton()
        plugin_foreach_with_mask()
        ha_kill_query()
        THD::awake()
        kill_one_thread()
      
      and mutexes are
      
        victim_thread->LOCK_thd_data -> lock_sys->mutex -> victim_trx->mutex
      
      To fix the mutex order violation we kill the victim thd asynchronously,
      from the manager thread
      29bbcac0
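The lock-order argument in the commit above can be illustrated with a minimal sketch. This is not the server's code: `Thd`, `lock_sys_mutex`, `pending_kills`, `request_kill()` and `process_pending_kills()` are hypothetical stand-ins. The conflict-resolution path only records the victim while it holds the lock-system mutex; a background thread later performs the kill, taking the mutexes in the same order as a normal KILL.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// All names below are hypothetical stand-ins, not MariaDB's actual types.
struct Thd {
  std::mutex lock_thd_data;  // stands in for victim_thread->LOCK_thd_data
  bool killed = false;
};

std::mutex lock_sys_mutex;        // stands in for lock_sys->mutex
std::mutex pending_mutex;         // protects pending_kills only
std::vector<Thd*> pending_kills;  // victims waiting for the background thread

// Called by the conflict-resolution path while lock_sys_mutex is held.
// It never takes lock_thd_data, so no lock_sys -> LOCK_thd_data edge is created.
void request_kill(Thd* victim) {
  assert(victim != nullptr);
  std::lock_guard<std::mutex> guard(pending_mutex);
  pending_kills.push_back(victim);
}

// Called later from a background thread that holds no other mutex.
// Mutexes are taken in the same order as a normal KILL:
// LOCK_thd_data first, then the lock-system mutex.
std::size_t process_pending_kills() {
  std::vector<Thd*> batch;
  {
    std::lock_guard<std::mutex> guard(pending_mutex);
    batch.swap(pending_kills);
  }
  for (Thd* victim : batch) {
    std::lock_guard<std::mutex> thd_guard(victim->lock_thd_data);
    std::lock_guard<std::mutex> sys_guard(lock_sys_mutex);
    victim->killed = true;
  }
  return batch.size();
}
```

With this shape, every path that needs both mutexes acquires them in one order, so the two stacks shown in the commit message can no longer wait on each other.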
    • Sergei Golubchik's avatar
      cleanup: void hton::abort_transaction() · 5d1db345
      Sergei Golubchik authored
      and void wsrep_innobase_kill_one_trx()
      
      as their return values are never used.
      Also remove redundant cast and checks that are always true
      5d1db345
    • Sergei Golubchik's avatar
      cleanup: fix and generalize handle_manager thread · 990eb093
      Sergei Golubchik authored
      * provide an argument to the callback
      * don't ignore a callback request if it's already present in the queue
      * initialize mutex/cond/in_use flag before starting the thread,
        in case the first callback queueing request arrives before
        handle_manager had time to initialize
      * set/check abort_manager under a mutex, otherwise handle_manager
        thread might destroy LOCK_manager before stop_handle_manager
        released it
      * signal COND on queueing a callback, stop cond_wait on callback request
      * always start the thread, even if flush_time is 0
      * but keep the old behavior in embedded (no replication, no galera)
      * style cleanups (e.g. remove volatile for a variable protected by a mutex)
      990eb093
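Several of the points listed in the handle_manager commit can be sketched as a small worker class. This is an illustrative sketch only, written with the C++ standard library rather than the server's own mutex and condition primitives; the class name `Manager` and its members are assumptions. It shows a callback that takes an argument, duplicate requests being kept, synchronization objects constructed before the thread starts, the abort flag handled under the mutex, and the condition variable signalled on every queued callback.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical sketch; not the server's handle_manager implementation.
class Manager {
 public:
  using Callback = void (*)(void*);

  // worker_ is the last member, so the mutex, condition variable and flag
  // are fully constructed before the thread starts running.
  Manager() : worker_(&Manager::run, this) {}
  ~Manager() { stop(); }

  // Queue a callback with its argument and wake the worker.
  // A request is kept even if the same callback is already queued.
  void queue(Callback fn, void* arg) {
    assert(fn != nullptr);
    {
      std::lock_guard<std::mutex> guard(mutex_);
      queue_.emplace_back(fn, arg);
    }
    cond_.notify_one();
  }

  // Set the abort flag under the mutex, then wait for the worker to exit.
  // Callbacks that are already queued still run before the worker returns.
  void stop() {
    {
      std::lock_guard<std::mutex> guard(mutex_);
      abort_ = true;
    }
    cond_.notify_one();
    if (worker_.joinable()) worker_.join();
  }

 private:
  void run() {
    std::unique_lock<std::mutex> lock(mutex_);
    for (;;) {
      // Sleep until there is work to do or a stop was requested.
      cond_.wait(lock, [this] { return abort_ || !queue_.empty(); });
      while (!queue_.empty()) {
        std::pair<Callback, void*> job = queue_.front();
        queue_.pop_front();
        lock.unlock();  // run the callback without holding the mutex
        job.first(job.second);
        lock.lock();
      }
      if (abort_) return;
    }
  }

  std::mutex mutex_;
  std::condition_variable cond_;
  std::deque<std::pair<Callback, void*>> queue_;
  bool abort_ = false;  // plain bool: every access happens under mutex_
  std::thread worker_;  // declared last on purpose, see the constructor
};
```

Because the worker joins inside `stop()`, the mutex is destroyed only after both threads have stopped using it, which is the hazard the commit describes for LOCK_manager.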
    • Sergei Golubchik's avatar
      don't allow `KILL QUERY ID USER xxx` · 4a7e6229
      Sergei Golubchik authored
      4a7e6229
  2. 23 Jan, 2021 1 commit
  3. 22 Jan, 2021 4 commits
  4. 21 Jan, 2021 4 commits
    • Daniel Black's avatar
      MDEV-10272: add master host/port info to slave thread exit messages · 29d9897f
      Daniel Black authored
      Sample log error message generated:
      
      2021-01-21  2:33:24 139912137520896 [Note] Slave SQL thread exiting, replication stopped in log 'master-bin.000001' at position 369
      2021-01-21  2:33:24 139912137520896 [Note] master was 127.0.0.1:16400
      2021-01-21  2:33:24 139912137828096 [Note] Slave I/O thread exiting, read up to log 'master-bin.000001', position 369
      2021-01-21  2:33:24 139912137828096 [Note] master was 127.0.0.1:16400
      
      Based on work by Hartmut Holzgraefe.
      
      Reviewer: knielsen@knielsen-hq.org, Andrei, Sachin
      29d9897f
    • Sujatha's avatar
      MDEV-8134: The relay-log is not flushed after the slave-relay-log.999999 showed · eb75e870
      Sujatha authored
      Problem:
      ========
      Auto purge of relaylogs stops when relay-log-file is
      'slave-relay-log.999999' and slave_parallel_threads is enabled.
      
      Analysis:
      =========
      In Relay_log_info::inc_group_relay_log_pos(), two log names are compared
      with strcmp(). This gives the correct result while the sequence numbers
      have the same number of digits (6 digits), but once the number reaches
      7 digits, 999999 compares greater than 1000000, which is wrong, hence
      the bug.
      
      Fix:
      ====
      Extract the numeric extension part of the file name, convert it into
      unsigned long and compare.
      
      Thanks to David Zhao for the contribution.
      eb75e870
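The difference between the two comparisons can be shown with a short sketch. The helper names `log_extension()` and `compare_log_names()` are hypothetical and are not taken from the patch; the sketch only assumes that the sequence number follows the last '.' in the file name.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: returns the numeric extension after the last '.',
// e.g. 999999 for "slave-relay-log.999999".
unsigned long log_extension(const char* name) {
  const char* dot = std::strrchr(name, '.');
  assert(dot != nullptr);  // the sketch assumes an extension is present
  return std::strtoul(dot + 1, nullptr, 10);
}

// Returns <0, 0 or >0 like strcmp(), but ordered by the numeric extension.
int compare_log_names(const char* a, const char* b) {
  const unsigned long na = log_extension(a);
  const unsigned long nb = log_extension(b);
  return (na > nb) - (na < nb);
}
```

For "slave-relay-log.999999" and "slave-relay-log.1000000", strcmp() stops at the first differing character ('9' versus '1') and reports the first name as greater, while the numeric comparison orders them correctly.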
    • Daniel Black's avatar
      maria: ma_recovery cppcheck va_start called twice · 53acd1c1
      Daniel Black authored
      cppcheck reports that va_start is called twice, and it is.
      
      Remove the second instance.
      53acd1c1
    • Daniel Black's avatar
      ucs2: cppcheck - add va_end · f2fea295
      Daniel Black authored
      f2fea295
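Both cppcheck fixes above concern the pairing of va_start and va_end. As a generic illustration (the function `sum_ints()` is a made-up example, unrelated to ma_recovery or the ucs2 code), a variadic function should call va_start once per traversal of the argument list and match it with one va_end:

```cpp
#include <cassert>
#include <cstdarg>

// Hypothetical example function: sums `count` int arguments.
int sum_ints(int count, ...) {
  assert(count >= 0);
  va_list args;
  va_start(args, count);  // exactly one va_start per traversal
  int total = 0;
  for (int i = 0; i < count; ++i)
    total += va_arg(args, int);
  va_end(args);           // every va_start is matched by a va_end
  return total;
}
```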
  5. 19 Jan, 2021 8 commits
  6. 18 Jan, 2021 1 commit
    • sjaakola's avatar
      MDEV-23851 BF-BF Conflict issue because of UK GAP locks · beaea31a
      sjaakola authored
      Some DML operations on tables having unique secondary keys cause scanning
      in the secondary index, for instance to find potential unique key violations
      in the secondary index. This scanning may involve GAP locking in the index.
      As this locking also happens when applying replication events in high priority
      applier threads, there is a probability of lock conflicts between two wsrep
      high priority threads.
      
      This PR avoids lock conflicts of high priority wsrep threads, which do
      secondary index scanning e.g. for duplicate key detection.
      
      The actual fix is the patch in sql_class.cc:thd_need_ordering_with(), where
      we allow relaxed GAP locking protocol between wsrep high priority threads.
      wsrep high priority threads (replication appliers, replayers and TOI processors)
      are ordered by the replication provider, and they will not need serializability
      support gained by secondary index GAP locks.
      
      The PR also contains an mtr test, which exercises a scenario where two replication
      applier threads have a false positive conflict in the GAP of a unique secondary index.
      The conflicting local committing transaction has to replay, and the test also verifies
      that the replaying phase will not conflict with the latter replication applier.
      The commit also contains a new test scenario for galera.galera_UK_conflict.test,
      where the replayer starts applying after a slave applier thread, with a later seqno,
      has advanced to the commit phase. The applier and replayer have a false positive GAP
      lock conflict on the secondary unique index, and the replayer should ignore this.
      This test scenario caused a crash with an earlier version of this PR, and to fix this,
      the secondary index uniqueness checking has been relaxed even further.
      
      Now innodb trx_t structure has new member: bool wsrep_UK_scan, which is set to
      true, when high priority thread is performing unique secondary index scanning.
      The member trx_t::wsrep_UK_scan is defined inside WITH_WSREP directive, to make
      it possible to prepare a MariaDB build where this additional trx_t member is
      not present and is not used in the code base. trx->wsrep_UK_scan is set to true
      only for the duration of the lock_rec_lock() call. It is read only in the
      lock_rec_has_to_wait() function, to relax the need to wait if
      wsrep_UK_scan is set and the conflicting transaction is also high priority.
      Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
      beaea31a
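The relaxed wait rule described in the last paragraph of the commit above can be written as a small predicate. This is a simplified sketch under stated assumptions, not InnoDB's lock_rec_has_to_wait(): the `Trx` struct, the `high_priority` field and the `locks_conflict` parameter are hypothetical, and the real function considers lock modes and gap flags that are omitted here.

```cpp
#include <cassert>

// Hypothetical, reduced view of a transaction for this sketch only.
struct Trx {
  bool high_priority;  // wsrep applier, replayer or TOI processor
  bool wsrep_UK_scan;  // set while scanning a unique secondary index
};

// Returns true if `requester` must wait for a lock held by `holder`.
// `locks_conflict` stands for the outcome of the ordinary conflict check.
bool has_to_wait(const Trx& requester, const Trx& holder, bool locks_conflict) {
  // Per the commit text, only high priority threads set the scan flag.
  assert(!requester.wsrep_UK_scan || requester.high_priority);
  if (!locks_conflict)
    return false;
  // Relaxation: a unique-key scan by a high priority thread does not wait
  // for another high priority thread, since the provider already orders them.
  if (requester.wsrep_UK_scan && holder.high_priority)
    return false;
  return true;
}
```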
  7. 15 Jan, 2021 2 commits
  8. 14 Jan, 2021 2 commits
  9. 13 Jan, 2021 3 commits
  10. 12 Jan, 2021 2 commits
    • Varun Gupta's avatar
      MDEV-23826: ORDER BY in view definition leads to wrong result with GROUP BY on query using view · ab271ee7
      Varun Gupta authored
      Introduced val_time_packed and val_datetime_packed functions for Item_direct_ref
      so that the value is taken from the item it refers to.
      
      The issue for incorrect result was that the item was getting its value
      from the temporary table rather than from the view.
      ab271ee7
    • Varun Gupta's avatar
      MDEV-23753: SIGSEGV in Column_stat::store_stat_fields · 3b94309a
      Varun Gupta authored
      For EITS collection min and max fields are allocated for each column
      that is set in the read_set bitmap of a table. This allocation of min and max
      fields happens inside alloc_statistics_for_table.
      
      For a partitioned table ha_rnd_init is called inside the function
      collect_statistics_for_table which sets the read_set bitmap for the columns
      inside the partition expression. This happens only when there is a write lock
      on the partitioned table.
      But the allocation happens before this, so min and max fields are not allocated
      for the columns involved in the partition expression.
      This resulted in a crash, as the EITS statistics were collected but there was
      no min and max field to store the value to.
      
      The fix is to call ha_rnd_init inside the function alloc_statistics_for_table,
      which ensures that min and max fields are allocated for the columns
      involved in the partition expression.
      3b94309a
  11. 11 Jan, 2021 8 commits