1. 18 Mar, 2019 4 commits
    • Daniel Black's avatar
      MDEV-18726: innodb buffer pool size not consistent with large pages · de51acd0
      Daniel Black authored
      Rather than add a small extra amount on the size of chunks, keep it
      of the specified size. The rest of the chunk initialization code
      adapts to this small size reduction. This has been made in the general
      case, not just large pages, to keep it simple.
      
      The chunk size is controlled by innodb-buffer-pool-chunk-size. Increasing
      this in the code by the length of a descriptor table causes trouble with
      large pages. With innodb-buffer-pool-chunk-size set to 2M, the code
      before this commit would have added a small extra amount to this value
      when allocating the chunk. While not normally a problem, it is with
      large pages, because the allocation then requires additional space: a
      whole extra large page. With a number of pools, or with 1G or 16G large
      pages, this is quite significant.
      
      By removing this additional amount, DBAs can set
      innodb-buffer-pool-chunk-size to the large page size, or a multiple of
      it, and actually get that amount allocated. Previously they had to fudge
      a slightly smaller value.
      
      The innodb.test results show how this is fudged over a number of tests. With
      this change the values are just between 488 and 500 depending on architecture
      and build options.
      
      Tested with --large-pages --innodb-buffer-pool-size=256M
      --innodb-buffer-pool-chunk-size=2M on x86_64 with 2M default large page
      size. Breaking before buf_pool init, one large page was allocated in
      MyISAM; by the end of the function 128 huge pages were allocated, as
      expected. A further 16 pages were allocated for a 32M log buffer, and
      during startup 1 page was allocated briefly to the redo log.
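
The arithmetic behind the numbers above can be sketched as follows. This is only an illustration, not the InnoDB allocation code: the helper name `large_pages_needed` and the 64 KiB descriptor overhead are hypothetical, chosen to show how any extra amount on top of a 2M chunk spills into a second 2M large page.

```python
MIB = 1024 * 1024

def large_pages_needed(alloc_bytes: int, large_page_bytes: int) -> int:
    """Number of whole large pages required to back an allocation."""
    return -(-alloc_bytes // large_page_bytes)  # ceiling division

chunk_size = 2 * MIB             # --innodb-buffer-pool-chunk-size=2M
large_page = 2 * MIB             # default large page size on x86_64
descriptor_overhead = 64 * 1024  # hypothetical figure, for illustration only

# Before the commit: chunk size plus a small extra amount per chunk.
pages_before = large_pages_needed(chunk_size + descriptor_overhead, large_page)
# After the commit: exactly the configured chunk size.
pages_after = large_pages_needed(chunk_size, large_page)

pool_size = 256 * MIB            # --innodb-buffer-pool-size=256M
chunks = pool_size // chunk_size

print(pages_before, pages_after, chunks * pages_after)  # 2 1 128
```

With one large page per chunk, the 128 chunks of a 256M pool account for the 128 huge pages reported in the test, and a 32M log buffer for a further 16.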
      de51acd0
    • Marko Mäkelä's avatar
      MDEV-18644: Support full_crc32 for page_compressed · 6b6fa3cd
      Marko Mäkelä authored
      This is a follow-up task to MDEV-12026, which introduced
      innodb_checksum_algorithm=full_crc32 and a simpler page format.
      MDEV-12026 did not enable full_crc32 for page_compressed tables,
      which we will be doing now.
      
      This is joint work with Thirunarayanan Balathandayuthapani.
      
      For innodb_checksum_algorithm=full_crc32 we change the
      page_compressed format as follows:
      
      FIL_PAGE_TYPE: The most significant bit will be set to indicate
      page_compressed format. The least significant bits will contain
      the compressed page size, rounded up to a multiple of 256 bytes.
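
A minimal sketch of this encoding, under two assumptions that the text above does not spell out: that FIL_PAGE_TYPE is a 16-bit field whose top bit is the marker, and that the low bits hold the size in units of 256 bytes. The constant and function names are hypothetical.

```python
PAGE_COMPRESSED_MARKER = 1 << 15  # assumed: most significant bit of a 16-bit field
SIZE_UNIT = 256                   # sizes are rounded up to a multiple of 256 bytes

def encode_page_type(compressed_len: int) -> int:
    """Pack a compressed page size into a FIL_PAGE_TYPE-like value."""
    units = -(-compressed_len // SIZE_UNIT)  # round up to whole 256-byte units
    if not 0 < units < PAGE_COMPRESSED_MARKER:
        raise ValueError("size does not fit in the low bits")
    return PAGE_COMPRESSED_MARKER | units

def decode_page_type(page_type: int):
    """Return the stored size in bytes, or None if the marker bit is clear."""
    if not page_type & PAGE_COMPRESSED_MARKER:
        return None
    return (page_type & ~PAGE_COMPRESSED_MARKER) * SIZE_UNIT

print(hex(encode_page_type(1000)))  # 0x8004
print(decode_page_type(0x8004))     # 1024
```

A 1000-byte payload is stored as four 256-byte units, so the reader sees a physical size of 1024 bytes.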
      
      The checksum will be stored in the last 4 bytes of the page
      (whether it is the full page or a page_compressed page whose
      size is determined by FIL_PAGE_TYPE), covering all preceding
      bytes of the page. If encryption is used, then the page will
      be encrypted between compression and computing the checksum.
      For page_compressed, FIL_PAGE_LSN will not be repeated at
      the end of the page.
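
The trailing-checksum layout can be sketched like this. It is an illustration only: `zlib.crc32` (plain CRC-32) stands in for the checksum function actually used by full_crc32, the big-endian byte order is an assumption, and the function names are hypothetical.

```python
import struct
import zlib

CHECKSUM_LEN = 4

def seal_page(payload: bytes) -> bytes:
    """Append a 4-byte checksum that covers all preceding bytes."""
    return payload + struct.pack(">I", zlib.crc32(payload))

def page_is_valid(page: bytes) -> bool:
    """Recompute the checksum over everything but the last 4 bytes."""
    if len(page) < CHECKSUM_LEN:
        return False
    body = page[:-CHECKSUM_LEN]
    (stored,) = struct.unpack(">I", page[-CHECKSUM_LEN:])
    return zlib.crc32(body) == stored

page = seal_page(b"header" + b"\x00" * 100 + b"data")
corrupted = b"X" + page[1:]
print(page_is_valid(page), page_is_valid(corrupted))  # True False
```

Because the checksum covers every preceding byte, the same check works whether the payload is a full page or a shorter page_compressed (and possibly encrypted) image.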
      
      FSP_SPACE_FLAGS (already implemented as part of MDEV-12026):
      We will store the innodb_compression_algorithm that may be used
      to compress pages. Previously, the choice of algorithm was written
      to each compressed data page separately, and one would be unable
      to know in advance which compression algorithm(s) are used.
      
      fil_space_t::full_crc32_page_compressed_len(): Determine if the
      page_compressed algorithm of the tablespace needs to know the
      exact length of the compressed data. If yes, we will reserve and
      write an extra byte for this right before the checksum.
      
      buf_page_is_compressed(): Determine if a page uses page_compressed
      (in any innodb_checksum_algorithm).
      
      fil_page_decompress(): Pass also fil_space_t::flags so that the
      format can be determined.
      
      buf_page_is_zeroes(): Check if a page is full of zero bytes.
      
      buf_page_full_crc32_is_corrupted(): Renamed from
      buf_encrypted_full_crc32_page_is_corrupted(). For full_crc32,
      we always simply validate the checksum against the page contents,
      while the physical page size is explicitly specified by an
      unencrypted part of the page header.
      
      buf_page_full_crc32_size(): Determine the size of a full_crc32 page.
      
      buf_dblwr_check_page_lsn(): Make this a debug-only function, because
      it involves potentially costly lookups of fil_space_t.
      
      create_table_info_t::check_table_options(),
      ha_innobase::check_if_supported_inplace_alter(): Do allow the creation
      of SPATIAL INDEX with full_crc32 also when page_compressed is used.
      
      commit_cache_norebuild(): Preserve the compression algorithm when
      updating the page_compression_level.
      
      dict_tf_to_fsp_flags(): Set the flags for page compression algorithm.
      FIXME: Maybe there should be a table option page_compression_algorithm
      and a session variable to back it?
      6b6fa3cd
    • Marko Mäkelä's avatar
      Follow-up fix to MDEV-12026: FIL_SPACE_FLAGS trump fil_space_t::flags · 2151aed4
      Marko Mäkelä authored
      Whenever we are reading the first page of a data file, we may have to
      adjust the provisionally created fil_space_t::flags to match what is
      actually inside the data files. In this way, we will never accidentally
      change the format of a data file.
      
      fil_node_t::read_page0(): After validating the FIL_SPACE_FLAGS,
      always assign them to space->flags.
      
      btr_root_adjust_on_import(), Datafile::validate_to_dd(),
      fil_space_for_table_exists_in_mem(): Adapt to the fix
      in fil_node_t::read_page0().
      
      fsp_flags_try_adjust(): Skip the adjustment if full_crc32 is being
      used. This adjustment was introduced in MDEV-11623 for upgrading
      from MariaDB 10.1.0 to 10.1.20, which used an accidentally changed
      format of FIL_SPACE_FLAGS. MariaDB before 10.4.3 never set the
      flag that now indicates the full_crc32 format.
      2151aed4
    • sachinsetia1001@gmail.com's avatar
  2. 17 Mar, 2019 2 commits
  3. 15 Mar, 2019 9 commits
    • Marko Mäkelä's avatar
    • Marko Mäkelä's avatar
      MDEV-18640: Correct a result · 3dd477db
      Marko Mäkelä authored
      Follow-up fix for commit b234f810
      3dd477db
    • Daniele Sciascia's avatar
      MDEV-18666 Fix MTR test galera_sr_kill_all_norecovery (#1229) · a2365767
      Daniele Sciascia authored
      * Disable `wsrep_sync_wait` before killing galera node which may be
        non-primary (kill_galera.inc causes lock wait timeouts  due to
        wsrep_sync_wait)
      
      * Remove unnecessary `--sleep 1`
      
      * Replace ```SELECT COUNT(*) = 0 ...``` with
        ```SELECT COUNT(*) `expect 0` ...```
      a2365767
    • sachinsetia1001@gmail.com's avatar
      MDEV-18809 Server crash in fields_in_hash_keyinfo or Assertion... · 2e34a031
      sachinsetia1001@gmail.com authored
      MDEV-18809 Server crash in fields_in_hash_keyinfo or Assertion `key_info->key_part->field->flags & (1<< 30)' failed in setup_keyinfo_hash
      
      Defer the call to setup_keyinfo_hash until all `continue` paths are
      exhausted. Also call re_setup_keyinfo_hash on the `goto err` path.
      2e34a031
    • Daniele Sciascia's avatar
    • Jan Lindström's avatar
      Disable mysql-wsrep#198 · d27aa35e
      Jan Lindström authored
      d27aa35e
    • sachinsetia1001@gmail.com's avatar
      MDEV-18922 Alter on long unique varchar column makes result null · 050280ce
      sachinsetia1001@gmail.com authored
      Don't add long key into share->keys_for_keyread
      050280ce
    • Teemu Ollakka's avatar
      10.4 wsrep group commit fixes (#1224) · 1ef50a34
      Teemu Ollakka authored
      * MDEV-16509 Improve wsrep commit performance with binlog disabled
      
      Release commit order critical section early after trx_commit_low() if
      binlog is not transaction coordinator. In order to avoid two phase commit,
      binlog_hton is not registered for THD during IO_CACHE population.
      
      Implemented a test which verifies that the transactions release
      commit order early.
      
      This optimization will change behavior during recovery as the commit
      is not two phase when binlog is off. Fixed and recorded wsrep-recover-v25
      and wsrep-recover to match the behavior.
      
      * MDEV-18730 Ordering for wsrep binlog group commit
      
      Previously, out-of-order execution was allowed for wsrep commits.
      Established proper ordering by populating wait_for_commit
      for every wsrep THD and making the group commit leader wait for
      prior commits before proceeding to trx_group_commit_leader().
      
      * MDEV-18730 Added a test case to verify correct commit ordering
      
      * MDEV-16509, MDEV-18730 Review fixes
      
      Use WSREP_EMULATE_BINLOG() macro to decide if the binlog_hton
      should be registered. Whitespace/syntax fixes and cleanups.
      
      * MDEV-16509 Require binlog for galera_var_innodb_disallow_writes test
      
      If the commit to InnoDB is done in one phase, the native InnoDB behavior
      is that the transaction is committed in memory before it is persisted to
      disk. This means that innodb_disallow_writes=ON may not prevent a
      transaction from becoming visible to other readers before the commit is
      completely over. On the other hand, if the commit is two phase (as it is
      with binlog), the transaction will be blocked in the prepare phase.
      
      Fixed the test to use binlog, which enforces two phase commit, which
      in turn makes the commit block before the changes become visible to
      other connections. This guarantees that the test produces the expected
      result.
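
The commit ordering described for MDEV-18730 can be illustrated with a toy model in which every committer waits for the one before it. This is not the server's wait_for_commit implementation; the `Committer` class and `run_commits` helper are hypothetical, and Python threads stand in for wsrep THDs.

```python
import threading

class Committer:
    """One transaction; `done` is signalled once its commit has finished."""

    def __init__(self, seqno, prior):
        self.seqno = seqno
        self.prior = prior              # the committer that must finish first
        self.done = threading.Event()

    def commit(self, log):
        if self.prior is not None:
            self.prior.done.wait()      # block until the prior commit is over
        log.append(self.seqno)
        self.done.set()

def run_commits(count):
    """Start the committers in reverse order and return the commit order."""
    log, committers, prior = [], [], None
    for seqno in range(count):
        prior = Committer(seqno, prior)
        committers.append(prior)
    threads = [threading.Thread(target=c.commit, args=(log,))
               for c in reversed(committers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log

print(run_commits(5))  # [0, 1, 2, 3, 4]
```

Even though the threads are started last-first, each one records its commit only after its predecessor has signalled completion, so the log is always in sequence order.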
      1ef50a34
    • Igor Babaev's avatar
      MDEV-18640 TABLE::prune_range_rowid_filters: Conditional jump or move · b234f810
      Igor Babaev authored
                 depends on uninitialized value
      
      This problem most probably was resolved by the patch for MDEV-18816.
      This commit adds only the test case from the bug entry.
      b234f810
  4. 14 Mar, 2019 4 commits
  5. 13 Mar, 2019 7 commits
  6. 12 Mar, 2019 11 commits
    • Sergey Vojtovich's avatar
      MDEV-18450 Slaves wait shutdown · 3568427d
      Sergey Vojtovich authored
      The patch features an optional shutdown behavior that holds on until
      all connected slaves have been sent the last binlogged event.
      A connected slave is one whose START SLAVE has been acknowledged and
      that has not been stopped since, though technically it could be
      reconnecting in the background.
      
      The solution therefore disallows killing the dump thread until it has
      found EOF of the latest binlog file.  It is up to the shutdown
      requester (DBA) to set up a sufficiently large shutdown timeout value
      for shutdown to wait patiently until lagging slaves have been
      synchronized. On the other hand, if a specific slave needs to be excluded
      from synchronization, the DBA would have to stop it manually, which
      would terminate its dump thread.
      
      `mysqladmin shutdown' is extended with a `--wait_for_all_slaves' option
      which translates to the `SHUTDOWN WAIT FOR ALL SLAVES' SQL query
      to enable the feature on the client side.
      
      The patch also performs a small refactoring of the server shutdown
      around close_connections() to introduce kill thread phases, of which
      there are currently two.
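
A toy model of the two kill phases, to make the intended behavior concrete. Everything here is hypothetical (the `Conn` class, the `shutdown` function, and modelling the timeout as a count of events); it only mirrors the description above: dump threads are spared in the first phase and killed in the second, after they have had a chance to reach the end of the binlog.

```python
from dataclasses import dataclass

@dataclass
class Conn:
    name: str
    is_dump_thread: bool = False
    events_behind: int = 0   # binlog events not yet sent to the slave

def shutdown(conns, wait_for_all_slaves, timeout_events):
    """Return (killed in phase 1, killed in phase 2, slaves fully synced)."""
    phase1, phase2, synced = [], [], []
    for c in conns:
        if wait_for_all_slaves and c.is_dump_thread:
            phase2.append(c.name)               # spared until the second phase
            if c.events_behind <= timeout_events:
                synced.append(c.name)           # reached EOF before the timeout
        else:
            phase1.append(c.name)               # ordinary connections go first
    return phase1, phase2, synced

conns = [Conn("client"), Conn("slave1", True, 10), Conn("slave2", True, 500)]
print(shutdown(conns, True, 100))
# (['client'], ['slave1', 'slave2'], ['slave1'])
print(shutdown(conns, False, 100))
# (['client', 'slave1', 'slave2'], [], [])
```

In this model, a lagging slave such as `slave2` is only synchronized if the timeout is large enough, which is why the choice of shutdown timeout is left to the DBA.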
      3568427d
    • Marko Mäkelä's avatar
      Merge 10.3 into 10.4 · e4505279
      Marko Mäkelä authored
      e4505279
    • Marko Mäkelä's avatar
      MDEV-18878: After-merge fixes · 69b33fca
      Marko Mäkelä authored
      In 10.3, all records will be processed by purge due to MDEV-12288.
      But, the insert undo records do not contain a transaction identifier.
      
      row_purge_parse_undo_rec(): Use node->trx_id=TRX_ID_MAX for the
      insert undo records. We cannot skip table lookups for these records
      after DISCARD TABLESPACE other than by 'detaching' the table from
      the undo logs by updating SYS_TABLES.ID on both DISCARD TABLESPACE
      and IMPORT TABLESPACE.
      
      Also, remove a redundant condition that was introduced
      in the merge commit 814205f3.
      69b33fca
    • Marko Mäkelä's avatar
      Merge 10.2 into 10.3 · b32bc70e
      Marko Mäkelä authored
      b32bc70e
    • Marko Mäkelä's avatar
      Add an end-of-tests marker to ease merges · f72760df
      Marko Mäkelä authored
      f72760df
    • Marko Mäkelä's avatar
      MDEV-18902 Uninitialized variable in recv_parse_log_recs() · bef947b4
      Marko Mäkelä authored
      recv_parse_log_recs(): Do not compare type if ptr==end_ptr
      (we have reached the end of the redo log parsing buffer),
      because it will not have been correctly initialized in that case.
      bef947b4
    • Marko Mäkelä's avatar
      MDEV-18878: Fix GCC -flifetime-dse · e070cfe3
      Marko Mäkelä authored
      GCC 6 and later can optimize away the memset() that is part of
      mem_heap_zalloc() in a placement new call. So, instead of relying
      on that kind of initialization, explicitly initialize the necessary
      fields in the constructors.
      
      que_common_t::que_common_t(): Initialize more fields in the
      default constructor.
      
      purge_vcol_info_t::purge_vcol_info_t(): Initialize all fields in
      the default constructor.
      
      purge_node_t::purge_node_t(): Initialize all necessary fields.
      
      Reference:
      
          https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71388
      
          https://gcc.gnu.org/ml/gcc/2016-02/msg00207.html
      e070cfe3
    • Marko Mäkelä's avatar
      Merge 10.1 into 10.2 · e374755b
      Marko Mäkelä authored
      e374755b
    • Marko Mäkelä's avatar
      MDEV-18749: Fix GCC -flifetime-dse · 32de60bb
      Marko Mäkelä authored
      row_merge_create_fts_sort_index(): Initialize dict_col_t in
      an unambiguous way. GCC 6 and later appear to be able to optimize
      away the memset() that is part of mem_heap_zalloc() in the
      placement new call. Let us avoid using placement new in order
      to ensure that the objects will actually be initialized.
      
      https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71388
      
      https://gcc.gnu.org/ml/gcc/2016-02/msg00207.html
      
      While the latter reference hints that the optimization is only
      applicable to non-POD types (and dict_col_t does not define
      any member functions before 10.2), it is most consistent to
      use the same initialization across all versions.
      32de60bb
    • Sergei Golubchik's avatar
      MDEV-17070 Table corruption or Assertion `table->file->stats.records > 0 ||... · 69abd437
      Sergei Golubchik authored
      MDEV-17070 Table corruption or Assertion `table->file->stats.records > 0 || error' or Assertion `!is_set() || (m_status == DA_OK_BULK && is_bulk_op())' failed upon actions on temporary table
      
      This was caused by a combination of factors:
      * MyISAM/Aria temporary tables historically never saved the state
        to disk (MYI/MAI), because the state never needed to persist
      * certain ALTER TABLE operations modify the original TABLE structure
        and if they fail, the original table has to be reopened to
        revert all changes (m_needs_reopen=1)
      
      as a result, when ALTER fails and MyISAM/Aria temp table gets reopened,
      it reads the stale state from the disk.
      
      As a fix, MyISAM/Aria tables now *always* write the state to disk
      on close, *unless* HA_EXTRA_PREPARE_FOR_DROP was done first. And
      the server now always does HA_EXTRA_PREPARE_FOR_DROP before dropping
      a temporary table.
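
The stale-state problem and the fix can be modelled in a few lines. This is a sketch only, not the MyISAM/Aria code: the `TempTable` class, its fields, and the `always_write_state` switch are hypothetical stand-ins for the behavior before and after this commit.

```python
class TempTable:
    """Tracks a record count in memory and the copy last written to disk."""

    def __init__(self):
        self.mem_records = 0
        self.disk_records = 0
        self.prepared_for_drop = False

    def insert(self, n=1):
        self.mem_records += n

    def prepare_for_drop(self):
        # Models HA_EXTRA_PREPARE_FOR_DROP: no need to persist state any more.
        self.prepared_for_drop = True

    def close(self, always_write_state=True):
        # New behavior: write the state unless the table is about to be dropped.
        if always_write_state and not self.prepared_for_drop:
            self.disk_records = self.mem_records

    def reopen(self):
        # A reopen (for example after a failed ALTER) reads the state from disk.
        self.mem_records = self.disk_records
        return self.mem_records

old = TempTable()
old.insert(3)
old.close(always_write_state=False)
print(old.reopen())  # 0: the stale on-disk state wins

new = TempTable()
new.insert(3)
new.close()
print(new.reopen())  # 3
```

Skipping the write after `prepare_for_drop()` keeps the cheap path for tables that are about to be dropped and will never be reopened.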
      69abd437
    • Sergei Golubchik's avatar
      7025a51a
  7. 11 Mar, 2019 3 commits
    • Alexey Botchkov's avatar
      MDEV-18886 JSON_ARRAY() does not recognise JSON argument. · acb4a872
      Alexey Botchkov authored
      JSON_ARRAY and JSON_OBJECT functions with no arguments now get the
      connection charset. Item_func_convert_charset returns the correct
      is_json() flag.
      acb4a872
    • Sergey Vojtovich's avatar
      ea52ecbc
    • Sergey Vojtovich's avatar
      MDEV-17595 - ALTER TABLE ADD FOREIGN KEY crash · 149b7547
      Sergey Vojtovich authored
      ALTER TABLE ... ADD FOREIGN KEY may trigger an assertion failure when
      it has a LOCK=EXCLUSIVE clause or when a concurrent FLUSH TABLES is
      being executed.
      
      In both cases the table being altered is marked as flushed, which forces
      a subsequent attempt to open the parent table to re-open it. That in
      turn is not allowed while a transaction is running.
      
      Rather than opening the parent table, just take the appropriate MDL lock.
      
      Also removed table_already_fk_prelocked() check: MDL itself has much
      better methods to handle duplicate locks. E.g. the former won't acquire
      MDL_SHARED_NO_WRITE if it already has MDL_SHARED_READ.
      149b7547