    - speed optimization: · f456b30c
    minimize writes to transactional Maria tables: don't write
    data pages, state, and open_count at the end of each statement.
    Data pages will be written by a background thread periodically.
    State will be written by Checkpoint periodically.
    open_count serves to detect when a table is potentially damaged
    due to an unclean mysqld stop, but thanks to recovery an unclean
    mysqld stop will be corrected and so open_count becomes useless.
    As state is written less often, it is often obsolete on disk,
    so we should avoid reading it from disk.
    - having removed the data page writes above, it is necessary to put
    them back at the start of some statements like check, repair and
    delete_all. In fact it was already necessary (see ma_delete_all.c).
    - disabling CACHE INDEX on Maria tables for now (fixes crash
    of test 'key_cache' when run with --default-storage-engine=maria).
    - correcting some fishy code in ma_extra.c (in theory, we could lose
    index pages when doing a DROP TABLE under Windows).
    
    
    storage/maria/ha_maria.cc:
      disable CACHE INDEX in Maria for now (there is a single cache for now),
      it crashes and it's not a priority
    storage/maria/ma_bitmap.c:
      debug message
    storage/maria/ma_check.c:
      The statement before maria_repair() may not flush state,
      so it needs to be done by maria_repair() (indeed this function
      uses maria_open(HA_OPEN_COPY) so reads state from disk,
      so needs to find it up-to-date on disk).
      For safety (but normally this is not needed) we remove index blocks
      out of the cache before repairing.
      _ma_flush_blocks() becomes _ma_flush_table_files_after_repair():
      it now additionally flushes the data file and state and syncs files.
      As a side effect, the assertion "no WRITE_CACHE_USED" from
      _ma_flush_table_files() fired, so we move all end_io_cache() calls made
      at the end of repair to before the calls to _ma_flush_table_files_after_repair().
    storage/maria/ma_close.c:
      when closing a transactional table, we fsync it. But we need to
      do this only after writing its state.
      We need to write the state at close time only for transactional
      tables (the other tables do that at last unlock).
      Putting back the O_RDONLY||crashed condition which I had
      removed earlier.
      Unmap the file before syncing it (does not matter now as Maria
      does not use mmap)
    storage/maria/ma_delete_all.c:
      need to flush data pages before chsize-ing the data file. This was needed
      even when we flushed data pages at the end of each statement, because we
      did not do it under LOCK TABLES anyway: the change here thus fixes this bug:
      create table t(a int) engine=maria;lock tables t write;
      insert into t values(1);delete from t;unlock tables;check table t;
      "Size of datafile is: 16384       Should be: 8192"
      (an obsolete page went to disk after the chsize(), at unlock time).
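
The sequence behind this bug can be illustrated with a toy model. This is only a sketch, not the Maria page cache: the `toy_*` names, the single-dirty-page cache and the 8192-byte block size are assumptions chosen to match the sizes in the report above.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 8192u  /* assumed block size, matching the report above */

/* Hypothetical toy model: a file length plus at most one dirty cached page. */
struct toy_table {
  size_t file_length;     /* current size of the data file on disk */
  int    has_dirty_page;  /* non-zero if a modified page exists only in cache */
  size_t dirty_offset;    /* file offset of that page */
};

/* Writing the cached page extends the file if the page lies past its end. */
static void toy_flush(struct toy_table *t)
{
  if (!t->has_dirty_page)
    return;
  assert(t->dirty_offset % PAGE_SIZE == 0);
  if (t->dirty_offset + PAGE_SIZE > t->file_length)
    t->file_length = t->dirty_offset + PAGE_SIZE;
  t->has_dirty_page = 0;
}

/* Stand-in for chsize(): set the file length. */
static void toy_chsize(struct toy_table *t, size_t new_length)
{
  t->file_length = new_length;
}

/*
  Models "insert; delete all; unlock" under LOCK TABLES.
  flush_first != 0 models flushing data pages before the chsize.
  Returns the data file length after the unlock-time flush.
*/
static size_t toy_delete_all_then_unlock(int flush_first)
{
  /* After the insert: one page on disk, and page 1 dirty in the cache. */
  struct toy_table t = { PAGE_SIZE, 1, PAGE_SIZE };
  if (flush_first)
    toy_flush(&t);            /* nothing dirty is left behind */
  toy_chsize(&t, PAGE_SIZE);  /* delete_all shrinks the file */
  toy_flush(&t);              /* unlock time: any leftover page goes to disk */
  return t.file_length;
}
```

In this model, `toy_delete_all_then_unlock(0)` leaves a 16384-byte file because the obsolete page is written after the chsize, while `toy_delete_all_then_unlock(1)` leaves the expected 8192 bytes.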
    storage/maria/ma_extra.c:
      When doing share->last_version=0, we make the MARIA_SHARE-in-memory
      invisible to future openers, so we need to have an up-to-date state
      on disk for them. In the same way, future openers will reopen the data
      and index file, so they will not find our cached blocks, so we
      need to flush them to disk.
      In HA_EXTRA_FORCE_REOPEN, this probably happens naturally as all
      tables normally get closed; we nevertheless add a safety flush.
      In HA_EXTRA_PREPARE_FOR_RENAME, we need to do the flushing. On
      Windows we additionally need to close files.
      In HA_EXTRA_PREPARE_FOR_DROP, we don't need to flush anything, only to
      remove dirty cached blocks from memory. On Windows we need to close
      files.
      Closing files forces us to sync them first (requirement for transactional
      tables).
      For mutex reasons (don't lock intern_lock twice), we move
      maria_lock_database() and _ma_decrement_open_count() first in the list
      of operations.
      Flush also data file in HA_EXTRA_FLUSH.
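
The per-operation rules above can be summarised as a small decision function. This is a sketch only: the `toy_*` enum, flags and function are hypothetical names for illustration, not the code in ma_extra.c, and the real code handles more details (mutexes, state writes).

```c
#include <assert.h>

/* Hypothetical stand-ins for the three HA_EXTRA_* operations discussed above. */
enum toy_op { TOY_FORCE_REOPEN, TOY_PREPARE_FOR_RENAME, TOY_PREPARE_FOR_DROP };

/* Hypothetical action flags. */
#define TOY_FLUSH_BLOCKS    1  /* write cached blocks to disk */
#define TOY_DISCARD_BLOCKS  2  /* drop dirty cached blocks without writing */
#define TOY_SYNC_AND_CLOSE  4  /* sync the files, then close them */

/* Returns the set of actions for an operation, per the notes above. */
static int toy_actions(enum toy_op op, int on_windows)
{
  int actions = 0;
  switch (op) {
  case TOY_FORCE_REOPEN:
    actions = TOY_FLUSH_BLOCKS;            /* safety flush */
    break;
  case TOY_PREPARE_FOR_RENAME:
    actions = TOY_FLUSH_BLOCKS;            /* future openers must find blocks on disk */
    if (on_windows)
      actions |= TOY_SYNC_AND_CLOSE;
    break;
  case TOY_PREPARE_FOR_DROP:
    actions = TOY_DISCARD_BLOCKS;          /* table is going away, nothing to write */
    if (on_windows)
      actions |= TOY_SYNC_AND_CLOSE;
    break;
  }
  /* Flushing and discarding the same blocks would be contradictory. */
  assert((actions & (TOY_FLUSH_BLOCKS | TOY_DISCARD_BLOCKS)) !=
         (TOY_FLUSH_BLOCKS | TOY_DISCARD_BLOCKS));
  return actions;
}
```

For example, `toy_actions(TOY_PREPARE_FOR_DROP, 1)` yields discard plus sync-and-close, whereas off Windows only the discard remains.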
    storage/maria/ma_locking.c:
      For transactional tables:
        - don't write data pages / state at unlock time;
        as a consequence, "share->changed=0" cannot be done;
        - don't write state in _ma_writeinfo();
        - don't maintain open_count on disk (Recovery corrects the table in case of crash
        anyway, and we gain speed by not writing open_count to disk).
      For non-transactional tables, flush the state at unlock only
      if the table was changed (optimization).
      Code which reads the state from disk is relevant only with
      external locking, so we disable it (if we want to re-enable it, it
      should not be done for transactional tables, as the state on disk may be
      obsolete: such tables do not flush state at unlock anymore).
      The comment "We have to flush the write cache" is now wrong because
      maria_lock_database(F_UNLCK) now happens before thr_unlock(), and
      we are not using external locking.
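
The unlock-time rules above reduce to two small predicates. This is a minimal sketch with hypothetical `toy_*` names and plain int flags; it is not the code in ma_locking.c.

```c
#include <assert.h>

/*
  Should the table state be written to disk at unlock time?
  transactional: non-zero for a transactional table (hypothetical flag)
  changed:       non-zero if the table was modified since the last flush
*/
static int toy_flush_state_at_unlock(int transactional, int changed)
{
  assert(transactional == 0 || transactional == 1);
  if (transactional)
    return 0;            /* state is written by Checkpoint, not at unlock */
  return changed != 0;   /* non-transactional: only if the table was changed */
}

/* Is open_count maintained on disk? Only for non-transactional tables. */
static int toy_maintain_open_count_on_disk(int transactional)
{
  return !transactional; /* Recovery corrects transactional tables after a crash */
}
```

A transactional table therefore never triggers a state write at unlock in this model, whatever its changed flag says.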
    storage/maria/ma_open.c:
      _ma_state_info_read() is only used in ma_open.c, making it static
    storage/maria/ma_recovery.c:
      set MARIA_SHARE::changed to TRUE when we are going to apply a
      REDO/UNDO, so that the state gets flushed at close.
    storage/maria/ma_test_recovery.expected:
      Changes introduced by this patch:
      - good: the "open" (table open, not properly closed) is gone,
      it was pointless for a recovered table
      - bad: probably stemming from the index's state being written at
      different moments (_ma_writeinfo() used to write the state after every row
      write in ma_test* programs, doesn't anymore as the table is
      transactional): some differences in indexes (not relevant as we don't
      yet have recovery for them); some differences in count of records
      (changed from a wrong value to another wrong value) (not relevant
      as we don't recover this count correctly yet anyway, though
      a patch will be pushed soon).
    storage/maria/ma_test_recovery:
      for repeatable output, no names of varying directories.
    storage/maria/maria_chk.c:
      function renamed
    storage/maria/maria_def.h:
      Function became local to ma_open.c. Function renamed.