1. 07 Feb, 2020 1 commit
    • MDEV-21674 purge_sys.stop() fails to wait for purge workers to complete · 8b97eba3
      Marko Mäkelä authored
      Since commit 5e62b6a5 (MDEV-16264),
      purge_sys_t::stop() no longer waited for all purge activity to stop.
      
      This caused problems on FLUSH TABLES...FOR EXPORT because of
      purge running concurrently with the buffer pool flush.
      The assertion at the end of buf_flush_dirty_pages() could fail.
      
      The fix, implemented by Vladislav Vaintroub, aims to eliminate race
      conditions when stopping or resuming purge:
      
      waitable_task::disable(): Wait for the task to complete, then replace
      the task callback function with noop.
      
      waitable_task::enable(): Restore the original task callback function
      after disable().
      
      purge_sys_t::stop(): Invoke purge_coordinator_task.disable().
      
      purge_sys_t::resume(): Invoke purge_coordinator_task.enable().
      
      purge_sys_t::running(): Add const qualifier, and clarify the comment.
      The purge coordinator task will remain active as long as any purge
      worker task is active.
      
      purge_worker_callback(): Assert purge_sys.running().
      
      srv_purge_wakeup(): Merge with the only caller purge_sys_t::resume().
      
      purge_coordinator_task: Use static linkage.
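
The disable()/enable() behavior described in this commit message can be pictured with a minimal sketch. It is not the MariaDB thread pool code: the class name `waitable_task_sketch` and all of its members are hypothetical, and it assumes plain `std::mutex` and `std::condition_variable` rather than whatever primitives the real implementation uses.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for a waitable task; not the MariaDB class.
class waitable_task_sketch
{
public:
  using callback_t = void (*)(void *);

  waitable_task_sketch(callback_t func, void *arg)
    : m_func(func), m_original(func), m_arg(arg)
  {
    assert(func != nullptr);
  }

  // Run whatever callback is currently installed.
  void execute()
  {
    callback_t f;
    {
      std::lock_guard<std::mutex> lk(m_mutex);
      f = m_func;
      ++m_running;
    }
    f(m_arg);
    {
      std::lock_guard<std::mutex> lk(m_mutex);
      --m_running;
    }
    m_cond.notify_all();
  }

  // Wait for in-flight executions to finish, then install the no-op.
  void disable()
  {
    std::unique_lock<std::mutex> lk(m_mutex);
    m_cond.wait(lk, [this] { return m_running == 0; });
    m_func = noop;
  }

  // Restore the callback that was passed to the constructor.
  void enable()
  {
    std::lock_guard<std::mutex> lk(m_mutex);
    m_func = m_original;
  }

private:
  static void noop(void *) {}

  std::mutex m_mutex;
  std::condition_variable m_cond;
  unsigned m_running = 0;
  callback_t m_func;
  callback_t m_original;
  void *m_arg;
};
```

After disable() returns, any later execute() runs the no-op, so a caller such as a stop() routine can rely on the original callback no longer running. In this sketch, calling disable() from inside the callback itself would deadlock.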
  2. 06 Feb, 2020 2 commits
    • MDEV-18582: Fix a race condition · cd3bdc09
      Marko Mäkelä authored
      srv_export_innodb_status(): While gathering
      innodb_mem_adaptive_hash, acquire btr_search_latches[i]
      in order to prevent a race condition with buffer pool resizing.
    • MDEV-21351: Free processed recv_sys_t::blocks · 6d214415
      Marko Mäkelä authored
      Release memory as soon as redo log records are processed.
      
      Because the memory allocation and deallocation of parsed redo log
      records must be protected by recv_sys.mutex, it is better to avoid
      using a std::atomic field for bookkeeping.
      
      buf_page_t::access_time: Keep track of the recv_sys.pages record
      allocations. The most significant 16 bits will count allocated
      blocks (which were previously counted by buf_page_t::buf_fix_count
      in the debug version), and the least significant 16 bits indicate
      the number of allocated bytes in the block (which was previously
      managed in buf_block_t::modify_clock), which must be a positive
      number, up to innodb_page_size. The byte offset 65536 is represented
      as the value 0.
      
      recv_recover_page(): Let the caller erase the log.
      
      recv_validate_tablespace(): Acquire recv_sys_t::mutex.
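
The 16/16-bit split of buf_page_t::access_time described above can be illustrated with a little arithmetic. The helper names below are hypothetical and are not taken from the MariaDB source; the sketch only assumes a 32-bit field whose upper half is a counter and whose lower half is a byte count in which 65536 is stored as 0.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers; the names are not from the MariaDB source.

// Pack a counter (upper 16 bits) and a byte count (lower 16 bits).
inline uint32_t pack_access_time(uint16_t n_allocated, uint32_t bytes)
{
  // The byte count must be positive and at most 65536.
  assert(bytes >= 1 && bytes <= 65536);
  // 65536 does not fit in 16 bits, so it wraps around to 0.
  return (uint32_t{n_allocated} << 16) | (bytes & 0xFFFF);
}

// Extract the counter from the upper 16 bits.
inline uint16_t allocation_count(uint32_t v)
{
  return static_cast<uint16_t>(v >> 16);
}

// Extract the byte count; a stored 0 stands for 65536.
inline uint32_t used_bytes(uint32_t v)
{
  const uint32_t low = v & 0xFFFF;
  return low ? low : 65536;
}
```

Because the byte count is required to be positive, the value 0 is otherwise unused and can stand for 65536 without ambiguity.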
  3. 05 Feb, 2020 1 commit
    • Incorrect behaviour of WSREP_SYNC_WAIT_UPTO_GTID (#1442) · d0c8316b
      mkaruza authored
      The function `signal_waiters` assigned the `m_committed_seqno` variable outside of
      the mutex lock, which caused incorrect behavior of WSREP_SYNC_WAIT_UPTO_GTID.
      Fixed by moving the assignment inside the lock. Added handling of OOM; an
      error is now reported.
      Removed the hard-coded seqno value; the seqno is now read directly from the current node state.
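
Why the assignment has to happen under the mutex can be shown with a small sketch. Only the names `signal_waiters` and `m_committed_seqno` come from the commit message; the class, the `wait_upto` method, and the use of `std::condition_variable` are assumptions made for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Hypothetical waiter registry; not the actual wsrep code.
class seqno_waiters_sketch
{
public:
  // Publish the committed seqno while holding the mutex, so a waiter
  // that checks the predicate under the same mutex cannot miss it.
  void signal_waiters(uint64_t seqno)
  {
    {
      std::lock_guard<std::mutex> lk(m_mutex);
      m_committed_seqno = seqno;
    }
    m_cond.notify_all();
  }

  // Returns true if the seqno was reached before the timeout expired.
  bool wait_upto(uint64_t seqno, std::chrono::milliseconds timeout)
  {
    std::unique_lock<std::mutex> lk(m_mutex);
    return m_cond.wait_for(lk, timeout,
                           [&] { return m_committed_seqno >= seqno; });
  }

private:
  std::mutex m_mutex;
  std::condition_variable m_cond;
  uint64_t m_committed_seqno = 0;
};
```

If the write happened outside the lock, a waiter could evaluate its predicate, see the old value, and go to sleep just after the notification was sent.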
  4. 04 Feb, 2020 2 commits
    • libpmem cmake macros · daaa881c
      Sergey Vojtovich authored
      Also added support for MAP_SYNC. It allows achieving decent performance
      with DAX devices even when libpmem is unavailable.
      
      Fixed the Windows version of my_msync(): according to the manual, FlushViewOfFile()
      may return before the flush has actually completed. It is advised to issue
      FlushFileBuffers() after FlushViewOfFile().
    • MDEV-20601: Make REPLICA a synonym for SLAVE in SQL statements · 42e825dd
      Sujatha authored
      Fix:
      ===
      Add "REPLICA" as an alias for "SLAVE". All commands which use the "SLAVE" keyword
      can be used with the new alias "REPLICA".
      
      List of commands:
      
      On Master:
      =========
      SHOW REPLICA HOSTS <--> SHOW SLAVE HOSTS
      Privilege "SLAVE"  <--> "REPLICA"
      
      On Slave:
      =========
      START SLAVE       <--> START REPLICA
      START ALL SLAVES  <--> START ALL REPLICAS
      START SLAVE UNTIL <--> START REPLICA UNTIL
      STOP SLAVE        <--> STOP REPLICA
      STOP ALL SLAVES   <--> STOP ALL REPLICAS
      RESET SLAVE       <--> RESET REPLICA
      RESET SLAVE ALL   <--> RESET REPLICA ALL
      SLAVE_POS         <--> REPLICA_POS
  5. 03 Feb, 2020 1 commit
  6. 01 Feb, 2020 1 commit
    • clean up redo log · 691c691a
      Eugene Kosov authored
      main change: rename first redo log without file close
      
      second change: use os_offset_t to represent offset in a file
      
      third change: fix log texts
  7. 30 Jan, 2020 1 commit
  8. 29 Jan, 2020 4 commits
    • Galera GTID support · 41bc7368
      mkaruza authored
      Support for Galera GTID consistency throughout the cluster. All nodes in the cluster
      should have the same GTID for replicated events that originate from the cluster.
      Cluster-originating commands need to contain a sequential WSREP GTID seqno.
      Ignore manual setting of gtid_seq_no=X.
      
      In a master-slave scenario where the master is a non-Galera node, the replicated
      GTID is preserved on all nodes.
      
      To achieve this, domain_id, server_id and seqnos should be the same on all nodes.
      The node that bootstraps the cluster sends its domain_id and
      server_id to the other nodes, and this combination is used to write the GTID for events
      that are replicated inside the cluster.
      
      Cluster nodes that execute non-replicated events will have a different
      GTID than replicated ones; the difference will be visible in the domain part of the GTID.
      
      With wsrep_gtid_domain_id you can set the domain_id for the WSREP cluster.
      
      The functions WSREP_LAST_WRITTEN_GTID, WSREP_LAST_SEEN_GTID and
      WSREP_SYNC_WAIT_UPTO_GTID now work with the "native" GTID format.
      
      Fixed Galera tests to reflect these changes.
      
      Add variable to manually update WSREP GTID seqno in cluster
      
      Add a variable to manipulate and change the WSREP GTID seqno. The next command
      originating from the cluster on the same thread will have the seqno that was set, and
      the cluster nodes should change their internal counter to its value.
      The behavior is the same as using @@gtid_seq_no for a non-WSREP transaction.
    • Cleanup: Remove mtr_state_t and mtr_t::m_state · 5defdc38
      Marko Mäkelä authored
      mtr_t::is_active(), mtr_t::is_committed(): Make debug-only.
    • MDEV-21362: Do not call memcmp on null pointers · c69a8629
      Marko Mäkelä authored
      Starting with commit 37344390,
      we would invoke memcmp() unconditionally, even if the length is zero.
      However, the behavior of memcmp() is undefined if either pointer argument
      is a null pointer, even if the length is zero.
      
      In the following tests, a null pointer is being passed to the comparison:
      vcol.vcol_keys_innodb gcol.gcol_keys_innodb main.func_group_innodb
      innodb.innodb_bug53592
      
      cmp_data(): Keep WITH_UBSAN happy and avoid potential future bugs
      in optimized builds, like the one addressed by
      commit fc168c3a (MDEV-15587).
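
A minimal sketch of the guard this commit message describes: skip the memcmp() call entirely when the length is zero, so that null pointers are never passed to it. The function name `cmp_bytes_sketch` is hypothetical and this is not the actual cmp_data() code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper; not the actual cmp_data() implementation.
inline int cmp_bytes_sketch(const void *a, const void *b, size_t len)
{
  // Passing a null pointer to memcmp() is undefined even for len == 0,
  // so handle the empty case before calling it.
  if (len == 0)
    return 0;
  assert(a != nullptr && b != nullptr);
  return std::memcmp(a, b, len);
}
```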
    • MDEV-21351 Replace recv_sys.heap with list of buf_block_t · 50324ce6
      Marko Mäkelä authored
      InnoDB crash recovery used a special type of mem_heap_t that
      allocates backing store from the buffer pool. That incurred
      a significant overhead, leading to underutilization of memory,
      and limiting the maximum contiguous allocated size of a log record.
      
      recv_sys_t::blocks: A linked list of buf_block_t that are allocated
      by buf_block_alloc() for redo log records. Replaces recv_sys_t::heap.
      We repurpose buf_block_t::unzip_LRU for linking the elements.
      
      recv_sys_t::max_log_blocks: Renamed from recv_n_pool_free_frames.
      
      recv_sys_t::max_blocks(): Accessor for max_log_blocks.
      
      recv_sys_t::alloc(): Allocate memory from the current recv_sys_t::blocks
      element, or allocate another block.  In debug builds, various free()
      member functions must be invoked, because we repurpose
      buf_page_t::buf_fix_count for tracking allocations.
      
      recv_sys_t::free_corrupted_page(): Renamed from recv_recover_corrupt_page()
      
      recv_sys_t::is_memory_exhausted(): Renamed from recv_sys_heap_check()
      
      recv_sys_t::pages and its elements are allocated directly by the
      system memory allocator.
      
      recv_parse_log_recs(): Remove the parameter available_memory.
      
      We rename some variables 'store_to_hash' to 'store', because
      recv_sys.pages is not actually a hash table.
      
      This is joint work with Thirunarayanan Balathandayuthapani.
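
The allocation scheme described for recv_sys_t::alloc() (take memory from the current block, or start another block when it does not fit) can be sketched as a simple bump allocator over a list of fixed-size blocks. Everything here is an assumption for illustration: the class name `block_list_sketch`, the use of `std::list` and heap arrays instead of buf_block_t, and the absence of any alignment handling or per-allocation freeing.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <memory>

// Hypothetical bump allocator over a list of equally sized blocks.
class block_list_sketch
{
public:
  explicit block_list_sketch(size_t block_size) : m_block_size(block_size) {}

  // Allocate len bytes from the current block, or from a fresh block
  // if the current one does not have enough room left.
  void *alloc(size_t len)
  {
    assert(len > 0 && len <= m_block_size);
    if (m_blocks.empty() || m_used + len > m_block_size)
    {
      m_blocks.emplace_back(new unsigned char[m_block_size]);
      m_used = 0;
    }
    void *p = m_blocks.back().get() + m_used;
    m_used += len;
    return p;
  }

  size_t n_blocks() const { return m_blocks.size(); }

private:
  size_t m_block_size;
  size_t m_used = 0; // bytes consumed in the last block
  std::list<std::unique_ptr<unsigned char[]>> m_blocks;
};
```

A single allocation can never exceed one block in this sketch, which is why the block size bounds the largest contiguous record.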
  9. 28 Jan, 2020 4 commits
  10. 27 Jan, 2020 2 commits
  11. 26 Jan, 2020 2 commits
  12. 25 Jan, 2020 1 commit
  13. 24 Jan, 2020 10 commits
    • MDEV-21383: Possible range plan is not used under certain conditions · 7e8a5802
      Sergei Petrunia authored
      [Variant 2 of the fix: collect the attached conditions]
      
      Problem:
      make_join_select() has a section of code which starts with
       "We plan to scan all rows. Check again if we should use an index."
      
      The code in that section will [unnecessarily] re-run the range
      optimizer using this condition:
      
        condition_attached_to_current_table AND current_table's_ON_expr
      
      Note that the original invocation of range optimizer in
      make_join_statistics was done using the whole select's WHERE condition.
      Taking the whole select's WHERE condition and using multiple-equalities
      allowed the range optimizer to infer more range restrictions.
      
      The fix:
      - Do range optimization using a condition that is an AND of this table's
      condition and all of the previous tables' conditions.
      - Also, fix the range optimizer to prefer SEL_ARGs with type=KEY_RANGE
      over SEL_ARGs with type=MAYBE_KEY, regardless of the key part.
      Computing
      key_and(
        SEL_ARG(type=MAYBE_KEY key_part=1),
        SEL_ARG(type=KEY_RANGE, key_part=2)
      )
      will now produce the SEL_ARG with type=KEY_RANGE.
    • cleanup redo log · b534a667
      Eugene Kosov authored
      class log_file_t: a more or less sane RAII wrapper around the redo log file
      descriptor and its path.
      
      This change is motivated by the need to use log_file_t elsewhere.
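
As a rough picture of what an RAII wrapper around a file descriptor and its path looks like, here is a minimal POSIX sketch. It is not the log_file_t from this commit: the class name `log_file_sketch`, its member names, and the choice of open() flags are all assumptions.

```cpp
#include <cassert>
#include <fcntl.h>
#include <string>
#include <unistd.h>
#include <utility>

// Hypothetical RAII wrapper; not the actual log_file_t.
class log_file_sketch
{
public:
  explicit log_file_sketch(std::string path) : m_path(std::move(path)) {}

  // The descriptor has a single owner: copying is disabled,
  // moving transfers ownership.
  log_file_sketch(const log_file_sketch &) = delete;
  log_file_sketch &operator=(const log_file_sketch &) = delete;
  log_file_sketch(log_file_sketch &&other) noexcept
    : m_fd(other.m_fd), m_path(std::move(other.m_path))
  {
    other.m_fd = -1;
  }

  // The destructor releases the descriptor if it is still open.
  ~log_file_sketch() { close(); }

  bool open()
  {
    assert(m_fd == -1);
    m_fd = ::open(m_path.c_str(), O_RDWR | O_CREAT, 0644);
    return m_fd != -1;
  }

  bool is_opened() const { return m_fd != -1; }

  void close()
  {
    if (m_fd != -1)
    {
      ::close(m_fd);
      m_fd = -1;
    }
  }

  const std::string &path() const { return m_path; }

private:
  int m_fd = -1;
  std::string m_path;
};
```

Keeping the path next to the descriptor means that code holding the wrapper has both at hand, and the descriptor cannot leak on an early return.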
    • fix tests · fdb9b05c
      Oleksandr Byelkin authored
    • Merge branch '10.3' into 10.4 · bfc24bb2
      Oleksandr Byelkin authored
    • Merge branch '10.2' into 10.3 · ceda5f72
      Oleksandr Byelkin authored
    • Merge branch '10.1' into 10.2 · f2ccfcac
      Oleksandr Byelkin authored
    • Marko Mäkelä's avatar
      ac3e3e12
    • MDEV-16678: Ignore #sql-ib tables in --suite=parts · 6af00b2c
      Marko Mäkelä authored
      We missed these in commit 89633995
      and commit ccd87d34.
    • MDEV-21490: binlog tests fail with valgrind: Conditional jump or move depends... · 599a0609
      Sujatha authored
      MDEV-21490: binlog tests fail with valgrind: Conditional jump or move depends on uninitialised value in sql_ex_info::init
      
      Problem:
      =======
      P1) Conditional jump or move depends on uninitialised value(s)
          sql_ex_info::init(char const*, char const*, bool) (log_event.cc:3083)
      
      Code: None of the following variables are initialized.
      ----
        return ((cached_new_format != -1) ? cached_new_format :
          (cached_new_format=(field_term_len > 1 || enclosed_len > 1 ||
          line_term_len > 1 || line_start_len > 1 || escaped_len > 1)));
      
      P2) Conditional jump or move depends on uninitialised value(s)
          Rows_log_event::Rows_log_event(char const*, unsigned
            int, Format_description_log_event const*) (log_event.cc:9571)
      
      Code: An uninitialized value is reported for the 'var_header_len' variable.
      ----
        if (var_header_len < 2 || event_len < static_cast<unsigned
            int>(var_header_len + (post_start - buf)))
      
      P3) Conditional jump or move depends on uninitialised value(s)
          Table_map_log_event::pack_info(Protocol*) (log_event.cc:11553)
      
      Code: 'm_table_id' is uninitialized.
      ----
        void Table_map_log_event::pack_info(Protocol *protocol)
        ...
        size_t bytes= my_snprintf(buf, sizeof(buf), "table_id: %lu (%s.%s)",
                                    m_table_id, m_dbnam, m_tblnam);
      
      Fix:
      ===
      P1 - Fix)
      Initialize the cached_new_format, field_term_len, enclosed_len, line_term_len,
      line_start_len and escaped_len members in the default constructor.
      
      P2 - Fix)
      "var_header_len" is initialized by reading the event buffer. In the case of an
      invalid event, the buffer will contain invalid data. Hence, a check was added to
      validate the event data. If event_len is smaller than the valid header length,
      return immediately.
      
      P3 - Fix)
      'm_table_id' within Table_map_log_event is initialized by reading data from
      the event buffer. Use the 'VALIDATE_BYTES_READ' macro to validate the current
      state of the buffer. If it is invalid, return immediately.
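
The P2 and P3 fixes share one idea: check that enough bytes remain in the event buffer before reading a field from it. The sketch below shows that idea only. The function `read_table_id_sketch`, the 6-byte little-endian width, and the parameter names are assumptions for illustration, and it does not reproduce the VALIDATE_BYTES_READ macro.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical bounds-checked read of a 6-byte little-endian field.
// Returns false, leaving *table_id untouched, if the buffer is too short.
inline bool read_table_id_sketch(const unsigned char *buf, size_t event_len,
                                 size_t offset, uint64_t *table_id)
{
  const size_t TABLE_ID_BYTES = 6; // assumed width, for illustration only
  // Written this way to avoid overflow in offset + TABLE_ID_BYTES.
  if (offset > event_len || event_len - offset < TABLE_ID_BYTES)
    return false;
  uint64_t v = 0;
  for (size_t i = 0; i < TABLE_ID_BYTES; i++)
    v |= static_cast<uint64_t>(buf[offset + i]) << (8 * i);
  *table_id = v;
  return true;
}
```

On a short or corrupted event the caller gets an explicit failure and can return immediately, instead of later using a member that was never initialized.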
  14. 23 Jan, 2020 8 commits
    • don't run main.ssl_system_ca in --embedded · 26a46444
      Sergei Golubchik authored
      This test needs a *server* and tries to connect to it with $MYSQL.
    • MENT-464 ASAN MTR quick test - some failures to be investigated. · 683a4988
      Alexey Botchkov authored
      PCRE reports a small frame size when working with ASAN, so the test has to be ready
      for the minimal possible size.
    • redo log misc fixes · 34dafb7e
      Eugene Kosov authored
      os_file_flush_data_func(): fix builds on POSIX OSs where fdatasync()
      is not available
      
      log_t::files::flush_data_only(): rename from fdatasync()
      
      log_t::files::fsync(): removed and replaced with flush_data_only().
      It will flush everything we need for using redo log files.
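
One common way to cope with platforms that lack fdatasync() is to fall back to fsync(), which flushes at least as much. The sketch below assumes the POSIX feature-test macro `_POSIX_SYNCHRONIZED_IO` as the compile-time switch; the real build may well use a configure-time check instead, and the function name `flush_data_only_sketch` is hypothetical.

```cpp
#include <cassert>
#include <unistd.h>

// Hypothetical helper: flush file data, preferring fdatasync() where
// the platform advertises it and falling back to fsync() otherwise.
inline int flush_data_only_sketch(int fd)
{
#if defined(_POSIX_SYNCHRONIZED_IO) && _POSIX_SYNCHRONIZED_IO > 0
  return ::fdatasync(fd);
#else
  return ::fsync(fd);
#endif
}
```

Both calls return 0 on success and -1 on failure, so callers do not need to know which one was used.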
    • Remove an unused tokuvalgrind script · 7aa443ca
      Marko Mäkelä authored
      This is the only symlink in the repository. Symlinks can cause
      trouble when using file systems or operating systems that do not
      support them.
      
      Also remove the unused file DartConfig.cmake that refers to the script.
    • MDEV-20775: page_zip_validate() failure due to AUTO_INCREMENT · 1d12bff4
      Marko Mäkelä authored
      cmake -DWITH_INNODB_EXTRA_DEBUG:BOOL=ON
      was broken ever since commit 8777458a
      (MDEV-6076 Persistent AUTO_INCREMENT for InnoDB).
      
      There is a race condition between page reads that call
      page_zip_validate() (while holding clustered index root page S-latch)
      and writes that update PAGE_ROOT_AUTO_INC
      (with buf_block_t::lock SX-latch, compatible with S-latch).
      
      page_zip_validate_low(): Skip the PAGE_ROOT_AUTO_INC field on
      clustered index root pages in order to avoid false positives.
    • MDEV-21551 : Assertion `m_active_threads.size() >= m_long_tasks_count +... · b19760b8
      Vladislav Vaintroub authored
      MDEV-21551: Assertion `m_active_threads.size() >= m_long_tasks_count + m_waiting_task_count' failed
      
      Happened when running innodb_fts.sync_ddl.
      
      m_long_tasks_count could be wrongly reset to 0 if m_task_queue is
      empty.
    • MDEV-21344: Uninitialized tbl_buf in dict_acquire_mdl_shared<false>() · 0e25a8b4
      Thirunarayanan Balathandayuthapani authored
      dict_table_t::parse_name(): Properly calculate the *tbl_name_len.
      
      A failure was easily repeatable during the test
      innodb.innodb-alter-debug for the table name test.① ("test/@2460").
      The UTF-8 representation of U+2460 is only 3 bytes, "\xe2\x91\xa0",
      while its filename-safe encoded counterpart in dict_table_t::name
      is 5 bytes, "@2460".
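
The length mismatch is easy to reproduce. The sketch below encodes one code point both ways and compares the sizes. Both helper names are hypothetical, `at_hex_encode` only mirrors the "@2460" form quoted above rather than the full filename-safe encoding, and the UTF-8 helper assumes a non-surrogate code point in the Basic Multilingual Plane.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Encode one BMP code point as UTF-8 (1 to 3 bytes).
inline std::string utf8_encode_bmp(uint16_t cp)
{
  std::string out;
  if (cp < 0x80)
    out += static_cast<char>(cp);
  else if (cp < 0x800)
  {
    out += static_cast<char>(0xC0 | (cp >> 6));
    out += static_cast<char>(0x80 | (cp & 0x3F));
  }
  else
  {
    out += static_cast<char>(0xE0 | (cp >> 12));
    out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
    out += static_cast<char>(0x80 | (cp & 0x3F));
  }
  return out;
}

// "@" followed by four hex digits, as in the "@2460" example.
inline std::string at_hex_encode(uint16_t cp)
{
  char buf[6];
  std::snprintf(buf, sizeof buf, "@%04x", static_cast<unsigned>(cp));
  return buf;
}
```

A buffer length computed from one representation and applied to the other is therefore off by two bytes for this character.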
      
      This bug, introduced by commit ea37b144
      (MDEV-16678), could cause a purge task to hang.
    • MDEV-14183: aria_pack segfaults in compress_maria_file · 7c166e68
      Vlad Lesin authored
      Post-push fix. aria_pack_mdev14183 test is unstable.
      
      The fix is the following:
      1. Disable the test for embedded server.
      2. Create non-"transactional" Aria table in the test, as aria_pack does not
      support "transactional" Aria tables.