    WL#3072 - Maria recovery · fc08f82b
    unknown authored
    Unit test for recovery: runs ma_test1 and ma_test2 (both only with
    INSERTs and DELETEs; UPDATEs disabled as not handled by recovery)
    then moves the tables elsewhere; recreates tables from the log, and
    compares and fails if there is a difference. Passes now.
    Most of maria_read_log.c moved to ma_recovery.c, as it will be re-used
    for recovery-from-ha_maria.
    Bug fixes in the applying of REDO_INSERT and REDO_PURGE_ROW.
    Applying of REDO_PURGE_BLOCKS, REDO_DELETE_ALL, REDO_DROP_TABLE,
    UNDO_ROW_INSERT (in REDO phase only, i.e. just doing records++),
    UNDO_ROW_DELETE, UNDO_ROW_PURGE.
    Code cleanups.
    Monty: please look for "QQ". Sanja: please look for "Sanja".
    Future tasks: recovery of the bitmap (easy), recovery of the state
    (make it idempotent), more REDOs (Monty to work on
    REDO_UPDATE?), UNDO phase...
    Pushing this cset as it looks safe, contains test and bugfixes which
    will help Monty implement applying of REDO_UPDATE.
    
    
    sql/handler.cc:
      typo
    storage/maria/Makefile.am:
      Adding ma_test_recovery (which ma_test_all invokes, and which can
      also be run alone). Most of maria_read_log.c moved to ma_recovery.c
    storage/maria/ha_maria.cc:
      comments
    storage/maria/ma_bitmap.c:
      fixing comments. 2 -> sizeof(maria_bitmap_marker).
      Bitmap-related part of _ma_initialize_data_file() moves into the bitmap module.
      Now putting the "bm" signature when creating the first bitmap page
      (it used to happen only at next open, but that
      caused an annoying difference when testing Recovery if the original
      run didn't open the table, and it looks more
      logical like this: it goes to disk only with its signature correct;
      see the "QQ" comment near the _ma_initialize_data_file() call
      in ma_create.c for more).
      When reading a bitmap page, verify its signature (happens when normally
      using the table or when CHECKing it; not when REPAIRing it).
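      A minimal sketch of the signature handling described above. The
      "bm" marker and sizeof() usage come from this message; the function
      names and the assumption that the marker sits in the last bytes of
      the page are illustrative only, not the actual ma_bitmap.c code.

```c
#include <assert.h>
#include <string.h>

/* The two-byte "bm" signature; sized with sizeof() rather than a literal 2 */
static const unsigned char bitmap_marker[2]= { 'b', 'm' };

/*
  Hypothetical helper: stamp the signature when the first bitmap page is
  created, so the page never reaches disk without it.
  Assumption: the marker occupies the last sizeof(bitmap_marker) bytes.
*/
static void stamp_bitmap_page(unsigned char *page, size_t block_size)
{
  assert(page != NULL && block_size >= sizeof(bitmap_marker));
  memcpy(page + block_size - sizeof(bitmap_marker),
         bitmap_marker, sizeof(bitmap_marker));
}

/* Hypothetical helper: verify the signature after reading a bitmap page */
static int bitmap_page_is_valid(const unsigned char *page, size_t block_size)
{
  if (page == NULL || block_size < sizeof(bitmap_marker))
    return 0;
  return memcmp(page + block_size - sizeof(bitmap_marker),
                bitmap_marker, sizeof(bitmap_marker)) == 0;
}
```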
    storage/maria/ma_blockrec.c:
      * no need to sync the data file if table is not transactional
      * Comments, code cleanup (log-related data moved to log-related code
      block, int5store->page_store).
      * Store the table's short id into LOGREC_UNDO_ROW_PURGE, like we
      do for other records (though this record will soon be replaced
      with a CLR).
      * If "page" is 1 it means the page which extends from byte
      page*block_size+1 to (page+1)*block_size (byte number 1 being
      the first byte of the file). The last byte of the file is
      data_file_length (same convention).
      A new page needs to be created if the last byte of the page is
      beyond the last byte of the file, i.e.
       (page+1)*block_size > data_file_length, so we correct the test
      (bug found when testing log applying for ma_test1 -M -T --skip-update).
      * update the page's LSN when removing a row from it during
      execution of a REDO_PURGE_ROW record (bug found when testing log
      applying for ma_test1 -M -T --skip-update).
      * applying of REDO_PURGE_BLOCKS (limited to a one-page range for now).
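      The page-existence test from the list above can be sketched as
      follows, comparing the page's last byte with the file's last byte
      under the 1-based byte numbering used in this message. The helper
      name and types are hypothetical, not the actual ma_blockrec.c code.

```c
#include <assert.h>
#include <stdint.h>

/*
  Hypothetical helper. Bytes are numbered from 1, so page N covers bytes
  N*block_size+1 .. (N+1)*block_size, and the last byte of the file is
  data_file_length. The page must be created when its last byte lies
  beyond the last byte of the file. This is also true when the file ends
  in the middle of the page.
*/
static int page_needs_creation(uint64_t page, uint32_t block_size,
                               uint64_t data_file_length)
{
  uint64_t last_byte_of_page;
  assert(block_size > 0);
  last_byte_of_page= (page + 1) * (uint64_t) block_size;
  return last_byte_of_page > data_file_length;
}
```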
    storage/maria/ma_blockrec.h:
      new functions. maria_bitmap_marker does not need to be exported.
    storage/maria/ma_close.c:
      we can always flush the table's state when closing the last instance
      of the table. And it is needed for maria_read_log (as it does
      not use maria_lock_database()).
    storage/maria/ma_control_file.c:
      when in Recovery, some assertions should not be used.
    storage/maria/ma_control_file.h:
      double-inclusion safe
    storage/maria/ma_create.c:
      during recovery, don't log records. Comments.
      Moving the creation of the first bitmap page to ma_bitmap.c
    storage/maria/ma_delete_table.c:
      during recovery, don't log records. Log the end-zero of the dropped
      table's name, so that recovery can use the string in place without
      extending it to fit an end zero.
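      A rough sketch of why logging the end zero helps: the reader can
      treat the record payload as a C string in place. Both function
      names are made up for illustration and do not reflect the real
      log record layout.

```c
#include <assert.h>
#include <string.h>

/*
  Hypothetical writer side: copy the table name INCLUDING its terminating
  zero into the record buffer. Returns bytes stored, or 0 if it does not fit.
*/
static size_t store_name_with_end_zero(unsigned char *buf, size_t buf_size,
                                       const char *name)
{
  size_t len;
  assert(buf != NULL && name != NULL);
  len= strlen(name) + 1;                /* +1 for the end zero */
  if (len > buf_size)
    return 0;
  memcpy(buf, name, len);
  return len;
}

/*
  Hypothetical reader side: use the payload in place as a string, after
  checking that the end zero really is the last byte of the payload.
*/
static const char *name_from_record(const unsigned char *payload,
                                    size_t payload_len)
{
  if (payload == NULL || payload_len == 0 ||
      payload[payload_len - 1] != '\0')
    return NULL;
  return (const char *) payload;
}
```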
    storage/maria/ma_loghandler.c:
      * inwrite_rec_hook also needs access to the MARIA_SHARE, like
      prewrite_rec_hook. This will be needed to update
      share->records_diff (in the upcoming patch "recovery of the state").
      * LOG_DESC::record_ends_group changed to an enum.
      * LOG_DESC for LOGREC_REDO_PURGE_BLOCKS and LOGREC_UNDO_ROW_PURGE
      corrected
      * Sanja please see the @todo LOG BUG
      * avoiding DBUG_RETURN(func()) as it gives confusing debug traces.
    storage/maria/ma_loghandler.h:
      - log write hooks called while the log's lock is held (inwrite_rec_hook)
      now need the MARIA_SHARE, like prewrite_rec_hook already had
      - instead of a bool saying if this record's type ends groups or not,
      we refine: it may not end a group, it may end a group, or it may
      be a group in itself. Imagine that we had a physical write failure
      to a table before we log the UNDO, we still end up in
      external_lock(F_UNLCK) and then we log a COMMIT: we don't want
      to consider this COMMIT as ending the group of REDOs (don't want
      to execute those REDOs during Recovery), that's why we say "COMMIT
      is a group in itself, it aborts any previous group". This also
      gives one more sanity check in maria_read_log.
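      The three-way distinction above could look roughly like this. The
      enum and struct names are illustrative assumptions, and the
      counters stand in for real record execution; this is a sketch of
      the grouping rule, not the log handler's implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative names: how a record type relates to record groups */
enum rec_in_group
{
  REC_NOT_LAST_IN_GROUP,  /* e.g. a REDO: waits for its group to end */
  REC_LAST_IN_GROUP,      /* e.g. an UNDO: ends the group, group is run */
  REC_IS_GROUP_ITSELF     /* e.g. COMMIT: aborts any unfinished group */
};

struct group_state
{
  size_t pending;   /* records seen in the currently open group */
  size_t executed;  /* records that became eligible for execution */
  size_t aborted;   /* records dropped because their group never ended */
};

static void feed_record(struct group_state *st, enum rec_in_group kind)
{
  assert(st != NULL);
  switch (kind)
  {
  case REC_NOT_LAST_IN_GROUP:
    st->pending++;
    break;
  case REC_LAST_IN_GROUP:
    st->executed+= st->pending + 1;  /* whole group, including this record */
    st->pending= 0;
    break;
  case REC_IS_GROUP_ITSELF:
    st->aborted+= st->pending;       /* unfinished group is not executed */
    st->pending= 0;
    st->executed++;                  /* the record stands on its own */
    break;
  }
}
```

      With this rule, a COMMIT logged after a failed write discards the
      REDOs that never got their UNDO instead of completing their group.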
    storage/maria/ma_recovery.c:
      New Recovery code, replacing the old pseudocode.
      Most of maria_read_log moved here.
      Call-able from ha_maria, but not enabled yet.
      Compared to the previous version of maria_read_log, some bugs have
      been fixed, debugging output can go to stdout or a disk file (for now
      it's useful for me, later it can be changed), execution of
      REDO_DROP_TABLE, REDO_DELETE_ALL, REDO_PURGE_BLOCKS has been added. Duplicate code
      has been factored into functions. We abort an unfinished group
      of records if we see a record which is a group in itself (like COMMIT).
      No need for maria_panic() after a bug (which caused tables to not
      be closed) was fixed; if there is yet another bug I prefer to see it.
      When opening a table for Recovery, set data_file_length
      and key_file_length to their real physical value (these are the
      easiest state members to restore :). Warn us if the last page
      was truncated (but Recovery handles it).
      MARIA_SHARE::state::state::records is now partly recovered (not
      idempotent, but works if recreating tables from scratch).
      When applying a REDO to a page, stamp it with the UNDO's LSN
      (current_group_end_lsn), not with the REDO's LSN; it makes
      the table more identical to the original table (easier to compare
      the two tables in the end).
      Big thing missing: some types of REDOs are not handled,
      and the UNDO phase does not exist (missing functions to execute UNDOs
      to actually rollback). So for now tests are only inserting/deleting
      a few 100 rows, closing the table and seeing if the log is applied ok;
      it works. UPDATE not handled.
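      The "warn if the last page was truncated" check mentioned above
      amounts to looking at the physical file length modulo the block
      size. A small sketch, with hypothetical helper names and the
      physical length assumed to be already obtained from the file system:

```c
#include <assert.h>
#include <stdint.h>

/*
  Hypothetical helper: number of bytes of a partially written last page.
  0 means the file consists of whole pages only; anything else would
  deserve a warning when opening the table for Recovery.
*/
static uint32_t truncated_tail_bytes(uint64_t physical_length,
                                     uint32_t block_size)
{
  assert(block_size > 0);
  return (uint32_t) (physical_length % block_size);
}

/* Hypothetical helper: number of complete pages physically present */
static uint64_t complete_pages(uint64_t physical_length, uint32_t block_size)
{
  assert(block_size > 0);
  return physical_length / block_size;
}
```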
    storage/maria/ma_recovery.h:
      new functions: ma_recover() for recovery from inside ha_maria;
      _ma_apply_log() for maria_read_log (ma_recover() calls _ma_apply_log()).
      Btw, we need to not use the word "recover" for REPAIR/maria_chk anymore.
    storage/maria/ma_rename.c:
      don't write log records during recovery
    storage/maria/ma_test2.c:
      - fail if maria_info() or other subtests find some wrong information
      - new option -g to skip updates.
      - init the translog before creating the table, so that log applying
      can work.
      - in "#if 0" you'll see some fixed bugs (will be removed).
    storage/maria/ma_test_all.sh:
      cleanup files. Test log applying.
    storage/maria/maria_read_log.c:
      most of the logic moves to ma_recovery.c to be shared between
      maria_read_log and recovery-from-inside-mysqld.
      See ma_recovery.c for additional changes made to the moved code.
    storage/maria/ma_test_recovery:
      unit test for Recovery. Tests insert and delete,
      REDO_UPDATE not yet coded.
      Script is called from ma_test_all. Can run standalone.