1. 12 Sep, 2007 5 commits
    • MY_ALLOW_ZERO_PTR in my_realloc() to fix safemalloc errors in pushbuild · 8d676e4a
      unknown authored
      
      storage/maria/ma_recovery.c:
        MY_ALLOW_ZERO_PTR needed as log_record_buffer.str is initially NULL.
    • fix for pushbuild test failures · 0b06da5f
      unknown authored
      
      mysql-test/r/rpl_row_flsh_tbls.result:
        result update
      mysql-test/r/rpl_row_insert_delayed.result:
        result update
      mysql-test/t/rpl_row_flsh_tbls.test:
        CREATE TABLE statement got an ENGINE clause so became longer
    • fix for pushbuild test failure (my_realloc() failed => checkpoint failed => Maria didn't start => tables were created as MyISAM). · 6b9c2a75
      unknown authored
      
      
      storage/maria/ma_checkpoint.c:
        safemalloc complains if my_realloc() is passed NULL and
        MY_ALLOW_ZERO_PTR is not used.
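A minimal sketch of the behaviour described above, assuming a strict reallocation wrapper that rejects a NULL pointer unless the caller opts in. `demo_realloc()`, `demo_grow()` and `DEMO_ALLOW_ZERO_PTR` are hypothetical stand-ins, not the real my_realloc() or MY_ALLOW_ZERO_PTR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define DEMO_ALLOW_ZERO_PTR 1 /* hypothetical stand-in for MY_ALLOW_ZERO_PTR */

/* Hypothetical strict wrapper: refuses a NULL pointer unless the caller opts in. */
static void *demo_realloc(void *ptr, size_t size, int flags)
{
  if (ptr == NULL)
  {
    if (!(flags & DEMO_ALLOW_ZERO_PTR))
      return NULL;               /* a debugging allocator would complain here */
    return malloc(size);         /* first allocation of an initially-NULL buffer */
  }
  return realloc(ptr, size);
}

/* Grow a buffer that may still be NULL. Returns 0 on success, 1 on failure. */
static int demo_grow(char **buf, size_t new_size, int flags)
{
  char *p = demo_realloc(*buf, new_size, flags);
  if (p == NULL)
    return 1;
  *buf = p;
  return 0;
}
```

In this sketch the first growth of a NULL buffer fails without the flag and succeeds with it, which mirrors the failure chain in the commit title.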
    • WL#3072 Maria Recovery · 9df34bd6
      unknown authored
      * added replaying of REDO_REPAIR_TABLE, but disabled it as
      mysterious linker errors appear.
      * after replaying RENAME/REPAIR, we must bump create_rename_lsn
      for idempotency of maria_read_log.
      
      
      sql/mysqld.cc:
        typo
      storage/maria/ma_checkpoint.c:
        silence compiler warning
      storage/maria/ma_recovery.c:
        * added replaying of REDO_REPAIR_TABLE, but disabled it as
        mysterious linker errors appear.
        * after replaying RENAME/REPAIR, we must bump create_rename_lsn
        for idempotency of maria_read_log.
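Why bumping create_rename_lsn gives idempotency can be sketched as follows. This is a toy model, not the code in ma_recovery.c: `demo_table` and `demo_replay_rename()` are hypothetical, and the `<=` comparison is an assumption about how "already applied" is decided.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t demo_lsn;

struct demo_table
{
  demo_lsn create_rename_lsn; /* records at or before this LSN are ignored */
  int renames_applied;        /* how many times a rename was really replayed */
};

/* Replays one rename record. Returns 1 if applied, 0 if skipped. */
static int demo_replay_rename(struct demo_table *t, demo_lsn rec_lsn)
{
  if (rec_lsn <= t->create_rename_lsn)
    return 0;                        /* already reflected in the table */
  t->renames_applied++;
  t->create_rename_lsn = rec_lsn;    /* the bump: a second run will skip it */
  return 1;
}
```

Replaying the same log twice applies the record once; without the bump the second run would apply it again.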
    • WL#3071 Maria checkpoint · cd275413
      unknown authored
      Finally this is the real checkpoint code.
      It however exhibits instabilities when a checkpoint runs concurrently
      with data-modifying clients (table corruption, transaction log's
      assertions) so for now a checkpoint is taken only at startup after
      recovery and at shutdown, i.e. not in concurrent situations. Later
      we will let it run periodically, as well as flush dirty pages
      periodically (almost all needed code is there already, only pagecache
      code is written but not committed).
      WL#3072 Maria recovery
      * replacing UNDO_ROW_PURGE with CLR_END; testing of those CLR_END via
      ma_test2 which has INSERTs failing with duplicate keys.
      * replaying of REDO_RENAME_TABLE
      Now, off to test Recovery in ha_maria :)
      
      
      BitKeeper/deleted/.del-ma_least_recently_dirtied.c:
        Delete: storage/maria/ma_least_recently_dirtied.c
      BitKeeper/deleted/.del-ma_least_recently_dirtied.h:
        Delete: storage/maria/ma_least_recently_dirtied.h
      storage/maria/Makefile.am:
        compile Checkpoint module
      storage/maria/ha_maria.cc:
        When ha_maria starts, do a recovery from last checkpoint.
        Take a checkpoint when that recovery has ended and when ha_maria
        shuts down cleanly.
      storage/maria/ma_blockrec.c:
        * even if my_sync() fails we have to my_close() (otherwise we leak
        a descriptor)
        * UNDO_ROW_PURGE is replaced by a simple CLR_END for UNDO_ROW_INSERT,
        as promised in the old comment; it gives us skipping during the
        UNDO phase.
      storage/maria/ma_check.c:
        All REDOs before create_rename_lsn are ignored by Recovery. So
        create_rename_lsn must be set only after all data/index has been
        flushed and forced to disk. We thus move write_log_record_for_repair()
        to after _ma_flush_table_files_after_repair().
      storage/maria/ma_checkpoint.c:
        Checkpoint module.
      storage/maria/ma_checkpoint.h:
        optional argument if caller wants a thread to periodically take
        checkpoints and flush dirty pages.
      storage/maria/ma_create.c:
        * no need to init some vars as the initial bzero(share) takes care of this.
        * update to new function's name
        * even if we fail in my_sync() we have to my_close()
      storage/maria/ma_extra.c:
        Checkpoint reads share->last_version under intern_lock, so we make
        maria_extra() update it under intern_lock. THR_LOCK_maria still needed
        because of _ma_test_if_reopen().
      storage/maria/ma_init.c:
        destroy checkpoint module when Maria shuts down.
      storage/maria/ma_loghandler.c:
        * UNDO_ROW_PURGE gone (see ma_blockrec.c)
        * we need to remember the LSN of the LOGREC_FILE_ID for a share,
        because this LSN is needed in the checkpoint record (Recovery wants
        to know the validity domain of an id->name mapping)
        * translog_get_horizon_no_lock() needed for Checkpoint
        * comment about failing assertion (Sanja knows)
        * translog_init_reader_data() thought that translog_read_record_header_scan()
        returns 0 in case of error, but 0 just means "0-length header".
        * translog_assign_id_to_share() now needs the MARIA_HA because
        LOGREC_FILE_ID uses a log-write hook.
        * Verify that (de)assignment of share->id happens only under intern_lock,
        as Checkpoint reads this id with intern_lock.
        * translog_purge() can accept TRANSLOG_ADDRESS, not necessarily
        a real LSN.
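The header-scan fix above comes down to keeping "zero-length header" and "error" as separate return values. A minimal sketch, assuming a toy record format where the first byte holds the header length; `DEMO_READ_ERROR`, `demo_read_header_len()` and `demo_init_reader()` are hypothetical names, not the translog API.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_READ_ERROR (-1) /* hypothetical error sentinel, distinct from 0 */

/* Returns the header length (possibly 0), or DEMO_READ_ERROR on failure. */
static int demo_read_header_len(const unsigned char *buf, size_t buf_len)
{
  if (buf == NULL || buf_len == 0)
    return DEMO_READ_ERROR;
  return (int) buf[0];       /* toy format: first byte is the header length */
}

/* Returns 0 on success, 1 on failure; a 0-length header is a success. */
static int demo_init_reader(const unsigned char *buf, size_t buf_len,
                            int *header_len)
{
  int len = demo_read_header_len(buf, buf_len);
  if (len == DEMO_READ_ERROR)
    return 1;
  *header_len = len;
  return 0;
}
```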
      storage/maria/ma_loghandler.h:
        prototype updates
      storage/maria/ma_open.c:
        no need to initialize "res"
      storage/maria/ma_pagecache.c:
        When taking a checkpoint, we don't need to know the maximum rec_lsn
        of dirty pages; this LSN was intended to be used in the two-checkpoint
        rule, but last_checkpoint_lsn is as good.
        4 bytes for stored_list_size is enough as PAGECACHE::blocks (number
        of blocks which the pagecache can contain) is int.
      storage/maria/ma_pagecache.h:
        new prototype
      storage/maria/ma_recovery.c:
        * added replaying of REDO_RENAME_TABLE
        * UNDO_ROW_PURGE gone (see ma_blockrec.c), replaced by CLR_END
        * Recovery from the last checkpoint record now possible
        * In new_table() we skip the table if the id->name mapping is older than
        create_rename_lsn (mapping dates from lsn_of_file_id).
        * in get_MARIA_HA_from_REDO_record() we skip the record
        if the id->name mapping is newer than the record (can happen if processing
        a record which is before the checkpoint record).
        * parse_checkpoint_record() has to return a LSN, that's what caller expects
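The two skip rules for id->name mappings can be written as a pair of predicates. This is a sketch only: the function names are hypothetical and the strictness of the comparisons is an assumption.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t demo_lsn;

/* Skip the table if its id->name mapping predates the table's
   create_rename_lsn (the mapping dates from lsn_of_file_id). */
static int demo_skip_table(demo_lsn lsn_of_file_id, demo_lsn create_rename_lsn)
{
  return lsn_of_file_id < create_rename_lsn;
}

/* Skip a REDO record if the id->name mapping is newer than the record,
   which can happen for records located before the checkpoint record. */
static int demo_skip_record(demo_lsn lsn_of_file_id, demo_lsn rec_lsn)
{
  return lsn_of_file_id > rec_lsn;
}
```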
      storage/maria/ma_rename.c:
        new function's name; log end zeroes of tables' names (ease recovery)
      storage/maria/ma_test2.c:
        * equivalent of ma_test1's --test-undo added (named -u here).
        * -t=1 now stops right after creating the table, so that
        we can test undoing of INSERTs with duplicate keys (which tests the
        CLR_END logged by _ma_write_abort_block_record()).
      storage/maria/ma_test_recovery.expected:
        Result of testing undoing of INSERTs with duplicate keys; there are
        some differences in maria_chk -dvv but they are normal (removing
        records does not shrink the data/index file and does not put back the
        "analyzed, optimized keys" (etc.) index state).
      storage/maria/ma_test_recovery:
        Test undoing of INSERTs with duplicate keys, using ma_test2;
        when such INSERT happens, it logs REDO_INSERT, UNDO_INSERT, REDO_DELETE,
        CLR_END; we abort after that, and test that CLR_END causes recovery
        to jump over UNDO_INSERT.
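How a CLR_END lets recovery jump over an already-undone UNDO can be sketched with a toy log. The model assumes each UNDO and CLR_END carries the LSN of a previous UNDO (0 meaning none); all types and functions below are hypothetical, not the Maria log format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t demo_lsn;
enum demo_type { DEMO_REDO, DEMO_UNDO, DEMO_CLR_END };

struct demo_rec
{
  enum demo_type type;
  demo_lsn lsn;
  demo_lsn prev_undo_lsn; /* used by DEMO_UNDO and DEMO_CLR_END; 0 = none */
};

/* Scan forward as the REDO phase would: where does the undo chain start? */
static demo_lsn demo_undo_lsn_after_scan(const struct demo_rec *log, size_t n)
{
  demo_lsn undo_lsn = 0;
  size_t i;
  for (i = 0; i < n; i++)
  {
    if (log[i].type == DEMO_UNDO)
      undo_lsn = log[i].lsn;
    else if (log[i].type == DEMO_CLR_END)
      undo_lsn = log[i].prev_undo_lsn;   /* jump over the undone record */
  }
  return undo_lsn;
}

/* Count the UNDO records the UNDO phase would still have to execute. */
static size_t demo_count_pending_undos(const struct demo_rec *log, size_t n)
{
  demo_lsn cur = demo_undo_lsn_after_scan(log, n);
  size_t count = 0;
  while (cur != 0)
  {
    demo_lsn next = 0;
    size_t i;
    for (i = 0; i < n; i++)
      if (log[i].lsn == cur)
      {
        next = log[i].prev_undo_lsn;
        break;
      }
    count++;
    cur = next;
  }
  return count;
}
```

With the sequence REDO_INSERT, UNDO_INSERT, REDO_DELETE, CLR_END from the test, nothing remains to undo in this model.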
      storage/maria/ma_write.c:
        comment
      storage/maria/maria_chk.c:
        comment
      storage/maria/maria_def.h:
        * a new bit in MARIA_SHARE::in_checkpoint, used to build a list
        of unique shares during Checkpoint.
        * MARIA_SHARE::lsn_of_file_id added: the LSN of the last LOGREC_FILE_ID
        for this share; needed to know to which LSN domain the mappings
        found in the Checkpoint record apply (new mappings should not apply
        to old REDOs).
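Using a bit in the share to build a list of unique shares could look roughly like this. Several open handlers may point to the same share, so the bit records "already listed". The struct layout, `DEMO_SHARE_LISTED` and `demo_collect_shares()` are hypothetical, and the locking described elsewhere in this commit is omitted.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_SHARE_LISTED 1u /* hypothetical bit inside in_checkpoint */

struct demo_share
{
  int id;
  unsigned in_checkpoint;
};

/* Copies each distinct share once into out[]; returns how many were stored.
   out[] must have room for n entries. */
static size_t demo_collect_shares(struct demo_share **handlers, size_t n,
                                  struct demo_share **out)
{
  size_t count = 0;
  size_t i;
  for (i = 0; i < n; i++)
  {
    struct demo_share *s = handlers[i];
    if (s->in_checkpoint & DEMO_SHARE_LISTED)
      continue;                          /* already in the list */
    s->in_checkpoint |= DEMO_SHARE_LISTED;
    out[count++] = s;
  }
  return count;
}
```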
      storage/maria/trnman.c:
        * small changes to how trnman_collect_transactions() fills its buffer;
        it also uses a non-dummy lsn_read_non_atomic() found in ma_checkpoint.h
  2. 11 Sep, 2007 3 commits
    • Merge gbichot@bk-internal.mysql.com:/home/bk/mysql-maria · cbac9b2c
      unknown authored
      into  gbichot4.local:/home/mysql_src/mysql-maria-no-flush-state
      
    • WL#3072 Maria recovery · b2293fb6
      unknown authored
      * testing of execution of UNDO_ROW_UPDATE
      * when executing an UNDO_ROW_UPDATE, store "UNDO_ROW_UPDATE" as
      "type of undone record" into the CLR_END record.
      
      
      storage/maria/ma_blockrec.c:
        When logging a CLR_END in write_block_record(), it can be for
        a DELETE or for an UPDATE (now that Monty has coded execution of
        UNDO_UPDATE)
      storage/maria/ma_loghandler.c:
        UNDO_ROW_UPDATE's execution coded, so no crash
      storage/maria/ma_recovery.c:
        UNDO_ROW_UPDATE's execution now coded, so no crash
      storage/maria/ma_test1.c:
        upper case letter
      storage/maria/ma_test_recovery.expected:
        output of testing execution of UNDO_ROW_UPDATE. Table's checksum
        not recovered (known issue not specific to UPDATE).
      storage/maria/ma_test_recovery:
        Test execution of UNDO_ROW_UPDATE: first we stop ma_test1 after
        deletes and commit, then we stop ma_test1 after updates and abort;
        we verify that updates are rolled back by comparing tables
    • Absence of test_file.h fixed. · 5e7bb6c6
      unknown authored
  3. 10 Sep, 2007 5 commits
    • Merge bk-internal.mysql.com:/home/bk/mysql-maria · 1f1522bd
      unknown authored
      into  mysql.com:/home/my/mysql-maria
      
      
      storage/maria/maria_read_log.c:
        Auto merged
    • Fixed some bugs when using undo of VARCHAR fields · f83b6b30
      unknown authored
      Fixed bug in undo_delete
      Fixed wrong error output from maria_check
      
      
      include/my_base.h:
        Added marker if we have null fields in table
      mysql-test/r/maria.result:
        checksum in Maria now ignores fields that are NULL
      sql/sql_table.cc:
        Ignore fields that are NULL
        (Before enabling this, we have to change MyISAM to also skip null fields)
      storage/maria/ma_blockrec.c:
        More logging
        After merge fixes
        Fixed some bugs when using undo of VARCHAR fields
        Fixed bug in undo_delete (We can't use info->rec_buff here as this is used in write_block_record())
      storage/maria/ma_blockrec.h:
        ma_recordpos_to_dir_entry changed to return uint
      storage/maria/ma_check.c:
        Fixed wrong output in case of errors
      storage/maria/ma_create.c:
        Set share.base.pack_reclength more correctly for block record
        Delete support for RAID
      storage/maria/ma_open.c:
        Don't calculate checksum fields with value NULL
      storage/maria/ma_test1.c:
        Fixed output from -v for VARCHAR keys
      storage/maria/ma_test_recovery.expected:
        Update results after adding new printf
        New checksums (because we now ignore nulls)
        Some file lengths are different, but I think they are OK (didn't have time to investigate)
      storage/myisam/ha_myisam.cc:
        Fixed comment
      storage/myisam/mi_test1.c:
        Fixed bug
    • fix for pushbuild failure, include trnman_public.h in source tarball (make dist) · 15f241f4
      unknown authored
      
      storage/maria/Makefile.am:
        include trnman_public.h in source tarball
    • fix a typo in #ifdef · 2ce9a7ac
      unknown authored
    • include maria in pushbuild's 'make dist' · f9d2d768
      unknown authored
  4. 09 Sep, 2007 2 commits
    • Merge bk-internal.mysql.com:/home/bk/mysql-maria · f6ca17e8
      unknown authored
      into  mysql.com:/home/my/mysql-maria
      
      
      storage/maria/ma_check.c:
        Auto merged
      storage/maria/ma_locking.c:
        Auto merged
      storage/maria/ma_loghandler.c:
        Auto merged
      storage/maria/ma_open.c:
        Auto merged
      storage/maria/ma_recovery.c:
        Auto merged
      storage/maria/maria_def.h:
        Auto merged
      storage/maria/maria_read_log.c:
        Auto merged
      storage/maria/ma_blockrec.c:
        Manual merge
      storage/maria/ma_test1.c:
        Manual merge (using Guilhem's code)
    • Added applying of undo for updates · ba1b1a5c
      unknown authored
      Fixed bug in duplicate key handling for block records during repair
      All read-row methods now return error number in case of error
      Don't calculate checksum for null fields
      Fixed bug when running maria_read_log with -o
      
      
      BUILD/SETUP.sh:
        Added STACK_DIRECTION
      BUILD/compile-pentium-debug-max:
        Moved STACK_DIRECTION to SETUP
      include/myisam.h:
        Added extra parameter to write_key
      storage/maria/ma_blockrec.c:
        Added applying of undo for updates
        Fixed indentation
        Removed some not needed casts
        Fixed wrong logging of CLR record
        Split ma_update_block_record to two functions to be able to reuse it from undo-applying
        Simplify filling of packed fields
        ma_record_block_record() now returns error number on failure
        Slightly changed log record information for undo-update
      storage/maria/ma_check.c:
        Fixed bug in duplicate key handling for block records during repair
      storage/maria/ma_checksum.c:
        Don't calculate checksum for null fields
      storage/maria/ma_dynrec.c:
        _ma_read_dynamic_record() now returns error number on error
        Rest of the changes are code simplification and indentation fixes
      storage/maria/ma_locking.c:
        Added comment
      storage/maria/ma_loghandler.c:
        More debugging
        Removed printing of total_record_length as this was always same as record_length
      storage/maria/ma_open.c:
        Allocate bitmap for changed fields
      storage/maria/ma_packrec.c:
        read_record now returns error number on error
      storage/maria/ma_recovery.c:
        Fixed wrong arguments to undo_row_update
      storage/maria/ma_statrec.c:
        read_record now returns error number on error (not 1)
        Code simplification
      storage/maria/ma_test1.c:
        Added exit possibility after update phase (to test undo of updates)
      storage/maria/maria_def.h:
        Include bitmap header file
      storage/maria/maria_read_log.c:
        Fixed bug when running with -o
  5. 07 Sep, 2007 5 commits
    • enable --with-maria-storage-engine · c66d41d8
      unknown authored
    • Fix for pushbuild maria.test failure, where directory syncing failed at the end of translog_flush() when datadir was in /dev/shm. · 85cef3d9
      unknown authored
      
      
      storage/maria/ma_loghandler.c:
        directory syncing can fail on shared memory devices (/dev/shm on Linux
        in this case); see my_sync_dir().
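A rough POSIX sketch of a directory sync that tolerates this kind of failure. It is not the real my_sync_dir(): the function names are hypothetical, and treating only EINVAL as ignorable is an assumption made for the illustration. The descriptor is closed even when the sync fails, so nothing leaks.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Assumption: a file system that cannot sync directories reports EINVAL. */
static int demo_sync_error_is_ignorable(int err)
{
  return err == EINVAL;
}

/* Syncs a directory. Returns 0 on success or ignorable failure, -1 otherwise. */
static int demo_sync_dir(const char *dir_name)
{
  int res = 0;
  int fd = open(dir_name, O_RDONLY);
  if (fd < 0)
    return -1;
  if (fsync(fd) != 0 && !demo_sync_error_is_ignorable(errno))
    res = -1;
  if (close(fd) != 0)      /* always close, even if the sync failed */
    res = -1;
  return res;
}
```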
    • If Maria engine is not compiled in, don't use page caches (fix for compiler errors in pushbuild). Small bugfix. · 79b0729d
      unknown authored
      
      
      sql/handler.h:
        don't use pagecaches if no Maria
      storage/maria/ma_check.c:
        correcting mistake in previous push; need to call this function
        otherwise create_rename_lsn would not be updated at end of REPAIR.
    • WL#3072 - Maria Recovery · 0ff7f3e1
      unknown authored
      At the end of recovery, we initialize the transaction manager's
      trid generator with the maximum trid seen during the REDO phase.
      This ensures that trids always grow (needed for versioning),
      even after a crash.
      This patch is only preparation, as ma_recover() is not called
      from ha_maria yet.
      
      
      storage/maria/ha_maria.cc:
        trnman_init() needs argument now (soon trnman_init() will rather
        be done via ma_recover() and thus it will not be 0)
      storage/maria/ma_recovery.c:
        During the REDO phase, remember the max long trid of transactions
        which we have seen (both in the checkpoint record and the
        LOGREC_LONG_TRANSACTION_ID records)
      storage/maria/ma_test1.c:
        trnman_init() needs argument now
      storage/maria/ma_test2.c:
        trnman_init() needs argument now
      storage/maria/trnman.c:
        new argument to trnman_init() so that caller can decide which
        value the generator of trids starts from.
      storage/maria/trnman_public.h:
        trnman_init() needs argument now
      storage/maria/unittest/trnman-t.c:
        trnman_init() needs argument now
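Seeding the trid generator from the maximum trid seen during the REDO phase can be sketched like this. `demo_trnman`, `demo_trnman_init()` and the other names are hypothetical; the real trnman_init() signature is not shown in this log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_trnman
{
  uint64_t last_trid; /* last transaction id handed out */
};

/* The caller decides which value the generator starts from. */
static void demo_trnman_init(struct demo_trnman *m, uint64_t initial_trid)
{
  m->last_trid = initial_trid;
}

static uint64_t demo_new_trid(struct demo_trnman *m)
{
  return ++m->last_trid;
}

/* Maximum trid among those seen while scanning the log; 0 if none. */
static uint64_t demo_max_trid(const uint64_t *seen, size_t n)
{
  uint64_t max = 0;
  size_t i;
  for (i = 0; i < n; i++)
    if (seen[i] > max)
      max = seen[i];
  return max;
}
```

Starting from the maximum guarantees that every new trid is larger than any trid found in the log, so trids keep growing across a crash.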
    • - WL#3072 Maria Recovery: · 753ee49f
      unknown authored
      Recovery of state.records (the count of records which is stored into
      the header of the index file). For that, state.is_of_lsn is introduced;
      logic is explained in ma_recovery.c (look for "Recovery of the state").
      The net gain is that in case of crash, we now recover state.records,
      and it is idempotent (ma_test_recovery tests it).
      state.checksum is not recovered yet, mail sent for discussion.
      - WL#3071 Maria Checkpoint: preparation for it, by protecting
      all modifications of the state in memory or on disk with intern_lock
      (with the exception of the really-often-modified state.records,
      which is now protected with the log's lock, see ma_recovery.c,
      look for "Recovery of the state"). Also, if maria_close() sees that
      Checkpoint is looking at this table it will not my_free() the share.
      - don't compute row's checksum twice in case of UPDATE (correction
      to a bugfix I made yesterday).
      
      
      storage/maria/ha_maria.cc:
        protect state write with intern_lock (against Checkpoint)
      storage/maria/ma_blockrec.c:
        * don't reset trn->rec_lsn in _ma_unpin_all_pages(), because it
        should wait until we have corrected the allocation in the bitmap
        (as the REDO can serve to correct the allocation during Recovery);
        introducing _ma_finalize_row() for that.
        * In a changeset yesterday I moved computation of the checksum
        into write_block_record(), to fix a bug in UPDATE. Now I notice
        that maria_update() already computes the checksum, it's just that
        it puts it into info->cur_row while _ma_update_block_record()
        uses info->new_row; so, removing the checksum computation from
        write_block_record(), putting it back into allocate_and_write_block_record()
        (which is called only by INSERT and UNDO_DELETE), and copying
        cur_row->checksum into new_row->checksum in _ma_update_block_record().
      storage/maria/ma_check.c:
        new prototypes, they will take intern_lock when writing the state;
        also take intern_lock when changing share->kfile. In both cases
        this is to protect against Checkpoint reading/writing the state or reading
        kfile at the same time.
        Not updating create_rename_lsn directly at end of write_log_record_for_repair()
        as it wouldn't have intern_lock.
      storage/maria/ma_close.c:
        Checkpoint builds a list of shares (under THR_LOCK_maria), then it
        handles each such share (under intern_lock) (doing flushing etc);
        if maria_close() freed this share between the two, Checkpoint
        would see a bad pointer. To avoid this, when building the list Checkpoint
        marks each share, so that maria_close() knows it should not free it
        and Checkpoint will free it itself.
        Extending the zone covered by intern_lock to protect against
        Checkpoint reading kfile, writing state.
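The hand-off between maria_close() and Checkpoint can be sketched as a two-flag protocol. This is a single-threaded illustration with hypothetical names; the real code does this under THR_LOCK_maria and intern_lock, which are left out here.

```c
#include <assert.h>
#include <stdlib.h>

#define DEMO_CP_LOOKING     1 /* Checkpoint has listed this share */
#define DEMO_CP_SHOULD_FREE 2 /* close asked Checkpoint to free it */

struct demo_share
{
  int in_checkpoint; /* bit flags above */
};

/* Returns 1 if the share was freed here, 0 if freeing was deferred. */
static int demo_close(struct demo_share *s)
{
  if (s->in_checkpoint & DEMO_CP_LOOKING)
  {
    s->in_checkpoint |= DEMO_CP_SHOULD_FREE; /* "ok then you will free it" */
    return 0;
  }
  free(s);
  return 1;
}

/* Called when Checkpoint is done with the share. Returns 1 if it freed it. */
static int demo_checkpoint_done(struct demo_share *s)
{
  s->in_checkpoint &= ~DEMO_CP_LOOKING;
  if (s->in_checkpoint & DEMO_CP_SHOULD_FREE)
  {
    free(s);
    return 1;
  }
  return 0;
}
```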
      storage/maria/ma_create.c:
        When we update create_rename_lsn, we also update is_of_lsn to
        the same value: it is logical, and allows us to test in maria_open()
        that the former is not bigger than the latter (the contrary is a sign
        of index header corruption, or severe logging bug which hinders
        Recovery, table needs a repair).
        _ma_update_create_rename_lsn_on_disk() also writes is_of_lsn;
        it now operates under intern_lock (protect against Checkpoint),
        a shortcut function is available for cases where acquiring
        intern_lock is not needed (table's creation or first open).
      storage/maria/ma_delete.c:
        if table is transactional, "records" is already decremented
        when logging UNDO_ROW_DELETE.
      storage/maria/ma_delete_all.c:
        comments
      storage/maria/ma_extra.c:
        Protect modifications of the state, in memory and/or on disk,
        with intern_lock, against a concurrent Checkpoint.
        When state goes to disk, update its is_of_lsn (by calling
        the new _ma_state_info_write()).
        In HA_EXTRA_FORCE_REOPEN, don't set share->changed to 0 (undoing
        a change I made a few days ago) and ASK_MONTY
      storage/maria/ma_locking.c:
        no real code change here.
      storage/maria/ma_loghandler.c:
        Log-write-hooks for updating "state.records" under log's mutex
        when writing/updating/deleting a row or deleting all rows.
      storage/maria/ma_loghandler_lsn.h:
        merge (make LSN_ERROR and LSN_REPAIRED_BY_MARIA_CHK different)
      storage/maria/ma_open.c:
        When opening a table verify that is_of_lsn >= create_rename_lsn; if
        false the header must be corrupted.
        _ma_state_info_write() is split in two: _ma_state_info_write_sub()
        which is the old _ma_state_info_write(), and _ma_state_info_write()
        which additionally takes intern_lock if requested (to protect
        against Checkpoint) and updates is_of_lsn.
        _ma_open_keyfile() should change kfile.file under intern_lock
        to protect Checkpoint from reading a wrong kfile.file.
      storage/maria/ma_recovery.c:
        Recovery of state.records: when the REDO phase sees UNDO_ROW_INSERT
        which has a LSN > state.is_of_lsn it increments state.records.
        Same for UNDO_ROW_DELETE and UNDO_ROW_PURGE.
        When closing a table during Recovery, we know its state is at least
        as new as the current log record we are looking at, so increase
        is_of_lsn to the LSN of the current log record.
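A toy model of the rule above, assuming that an UNDO_ROW_INSERT newer than is_of_lsn increments the count and an UNDO_ROW_DELETE decrements it. The types and functions are hypothetical and UNDO_ROW_PURGE is left out.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t demo_lsn;
enum demo_undo_type { DEMO_UNDO_ROW_INSERT, DEMO_UNDO_ROW_DELETE };

struct demo_state
{
  uint64_t records;   /* row count stored in the index header */
  demo_lsn is_of_lsn; /* the state is known to be at least this recent */
};

/* Called when the REDO phase meets an UNDO record for this table. */
static void demo_redo_sees_undo(struct demo_state *st,
                                enum demo_undo_type type, demo_lsn rec_lsn)
{
  if (rec_lsn <= st->is_of_lsn)
    return;                       /* already accounted for in the state */
  if (type == DEMO_UNDO_ROW_INSERT)
    st->records++;
  else
    st->records--;
}

/* Closing a table during Recovery moves is_of_lsn forward. */
static void demo_close_in_recovery(struct demo_state *st, demo_lsn current_lsn)
{
  if (current_lsn > st->is_of_lsn)
    st->is_of_lsn = current_lsn;
}
```

Running the same records through the model a second time, after the close, changes nothing, which is the idempotency that ma_test_recovery checks for the real code.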
      storage/maria/ma_rename.c:
        update for new behaviour of _ma_update_create_rename_lsn_on_disk().
      storage/maria/ma_test1.c:
        update to new prototype
      storage/maria/ma_test2.c:
        update to new prototype (actually prototype was changed days ago,
        but compiler does not complain about the extra argument??)
      storage/maria/ma_test_recovery.expected:
        new result file of ma_test_recovery. Improvements: record
        count read from index's header is now always correct.
      storage/maria/ma_test_recovery:
        "rm" fails if file does not exist. Redirect stderr of script.
      storage/maria/ma_write.c:
        if table is transactional, "records" is already incremented when
        logging UNDO_ROW_INSERT. Comments.
      storage/maria/maria_chk.c:
        update is_of_lsn too
      storage/maria/maria_def.h:
        - MARIA_STATE_INFO::is_of_lsn which is used by Recovery. It is stored
        into the index file's header.
        - Checkpoint can now mark a table as "don't free this", and maria_close()
        can reply "ok then you will free it".
        - new functions
      storage/maria/maria_pack.c:
        update for new name
  6. 06 Sep, 2007 2 commits
    • - speed optimization: · f456b30c
      unknown authored
      minimize writes to transactional Maria tables: don't write
      data pages, state, and open_count at the end of each statement.
      Data pages will be written by a background thread periodically.
      State will be written by Checkpoint periodically.
      open_count serves to detect when a table is potentially damaged
      due to an unclean mysqld stop, but thanks to recovery an unclean
      mysqld stop will be corrected and so open_count becomes useless.
      As state is written less often, it is often obsolete on disk,
      so we should avoid reading it from disk.
      - by removing the data page writes above, it is necessary to put
      it back at the start of some statements like check, repair and
      delete_all. It was already necessary in fact (see ma_delete_all.c).
      - disabling CACHE INDEX on Maria tables for now (fixes crash
      of test 'key_cache' when run with --default-storage-engine=maria).
      - correcting some fishy code in ma_extra.c (we possibly could lose
      index pages when doing a DROP TABLE under Windows, in theory).
      
      
      storage/maria/ha_maria.cc:
        disable CACHE INDEX in Maria for now (there is a single cache for now),
        it crashes and it's not a priority
      storage/maria/ma_bitmap.c:
        debug message
      storage/maria/ma_check.c:
        The statement before maria_repair() may not flush state,
        so it needs to be done by maria_repair() (indeed this function
        uses maria_open(HA_OPEN_COPY) so reads state from disk,
        so needs to find it up-to-date on disk).
        For safety (but normally this is not needed) we remove index blocks
        out of the cache before repairing.
        _ma_flush_blocks() becomes _ma_flush_table_files_after_repair():
        it now additionally flushes the data file and state and syncs files.
        As a side effect, the assertion "no WRITE_CACHE_USED" from
        _ma_flush_table_files() fired so we move all end_io_cache() done
        at the end of repair to before the calls to _ma_flush_table_files_after_repair().
      storage/maria/ma_close.c:
        when closing a transactional table, we fsync it. But we need to
        do this only after writing its state.
        We need to write the state at close time only for transactional
        tables (the other tables do that at last unlock).
        Putting back the O_RDONLY||crashed condition which I had
        removed earlier.
        Unmap the file before syncing it (does not matter now as Maria
        does not use mmap)
      storage/maria/ma_delete_all.c:
        need to flush data pages before chsize-ing the data file. Was needed even when
        we flushed data pages at the end of each statement, because we did not
        do it under LOCK TABLES anyway: the change here thus fixes this bug:
        create table t(a int) engine=maria;lock tables t write;
        insert into t values(1);delete from t;unlock tables;check table t;
        "Size of datafile is: 16384       Should be: 8192"
        (an obsolete page went to disk after the chsize(), at unlock time).
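The size mismatch above can be reproduced in a toy model of a file with one cached dirty page. The 8192-byte page size and the single-dirty-page cache are assumptions chosen to match the numbers in the error message; none of the names below exist in Maria.

```c
#include <assert.h>

#define DEMO_PAGE_SIZE 8192L

struct demo_file
{
  long size;      /* current length of the data file on disk */
  int dirty_page; /* index of one cached dirty page, -1 if none */
};

/* Writes the cached dirty page, extending the file if it lies past the end. */
static void demo_flush(struct demo_file *f)
{
  if (f->dirty_page >= 0)
  {
    long end = (f->dirty_page + 1) * DEMO_PAGE_SIZE;
    if (end > f->size)
      f->size = end;
    f->dirty_page = -1;
  }
}

/* Truncates the file to one page, optionally flushing the cache first. */
static void demo_delete_all(struct demo_file *f, int flush_first)
{
  if (flush_first)
    demo_flush(f);
  f->size = DEMO_PAGE_SIZE; /* the chsize() step */
}
```

Without the flush, the stale page written at unlock time grows the file back to 16384 bytes; with it, the file stays at 8192.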
      storage/maria/ma_extra.c:
        When doing share->last_version=0, we make the MARIA_SHARE-in-memory
        invisible to future openers, so need to have an up-to-date state
        on disk for them. The same way, future openers will reopen the data
        and index file, so they will not find our cached blocks, so we
        need to flush them to disk.
        In HA_EXTRA_FORCE_REOPEN, this probably happens naturally as all
        tables normally get closed, we however add a safety flush.
        In HA_EXTRA_PREPARE_FOR_RENAME, we need to do the flushing. On
        Windows we additionally need to close files.
        In HA_EXTRA_PREPARE_FOR_DROP, we don't need to flush anything but
        remove dirty cached blocks from memory. On Windows we need to close
        files.
        Closing files forces us to sync them before (requirement for transactional
        tables).
        For mutex reasons (don't lock intern_lock twice), we move
        maria_lock_database() and _ma_decrement_open_count() first in the list
        of operations.
        Flush also data file in HA_EXTRA_FLUSH.
      storage/maria/ma_locking.c:
        For transactional tables:
          - don't write data pages / state at unlock time;
          as a consequence, "share->changed=0" cannot be done.
          - don't write state in _ma_writeinfo()
          - don't maintain open_count on disk (Recovery corrects the table in case of crash
          anyway, and we gain speed by not writing open_count to disk),
        For non-transactional tables, flush the state at unlock only
        if the table was changed (optimization).
        Code which read the state from disk is relevant only with
        external locking, so we disable it (if we want to re-enable it, it should
        not apply to transactional tables, as the state on disk may be obsolete:
        such tables do not flush state at unlock anymore).
        The comment "We have to flush the write cache" is now wrong because
        maria_lock_database(F_UNLCK) now happens before thr_unlock(), and
        we are not using external locking.
      storage/maria/ma_open.c:
        _ma_state_info_read() is only used in ma_open.c, making it static
      storage/maria/ma_recovery.c:
        set MARIA_SHARE::changed to TRUE when we are going to apply a
        REDO/UNDO, so that the state gets flushed at close.
      storage/maria/ma_test_recovery.expected:
        Changes introduced by this patch:
        - good: the "open" (table open, not properly closed) is gone,
        it was pointless for a recovered table
        - bad: stemming from different moments of writing the index's state
        probably (_ma_writeinfo() used to write the state after every row
        write in ma_test* programs, doesn't anymore as the table is
        transactional): some differences in indexes (not relevant as we don't
        yet have recovery for them); some differences in count of records
        (changed from a wrong value to another wrong value) (not relevant
        as we don't recover this count correctly yet anyway, though
        a patch will be pushed soon).
      storage/maria/ma_test_recovery:
        for repeatable output, no names of varying directories.
      storage/maria/maria_chk.c:
        function renamed
      storage/maria/maria_def.h:
        Function became local to ma_open.c. Function renamed.
    • WL#3072 Maria Recovery · d1886afc
      unknown authored
      misc fixes of execution of UNDOs in the UNDO phase:
      - into the CLR_END, store the LSN of the _previous_ UNDO (we debated
      what was best, so far we're going with "previous"; later we can change
      to "current" if needed), and store the type of record which is being
      undone (needed to know how to update state.records when we see the
      CLR_END during the REDO phase).
      - declaring all UNDOs and CLR_END as "compressed"
      - when executing an UNDO in the UNDO phase, state.records is updated
      as a hook when writing CLR_END (needed for "recovery of the state"),
      and so is trn->undo_lsn (needed for when we have checkpoints).
      - bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum
      into the re-inserted row, maria_chk -r thus threw the row away).
      - modifications of ma_test1: where to stop is now driven by --testflag;
      --test-undo just tells how to stop (flush data, flush log, nothing).
      - ma_test_recovery: testing of the UNDO phase, more testing of the
      REDO phase, identification of a bug.
      
      
      storage/maria/ma_blockrec.c:
        - bugfix: execution of UNDO_ROW_DELETE didn't store the correct
        checksum into the row (leading to "maria_chk -r" eliminating the
        re-inserted row, net effect was that rollback appeared to have
        rolled back no deletion). Reason was that write_block_record() used
        info->cur_row.checksum, while "row" can be != &info->cur_row
        (case of UNDO_ROW_DELETE). After fixing this, problems with
        _ma_update_block_record() appeared; indeed checksum was computed
        by  allocate_and_write_block_record() while _ma_update_block_record()
        directly calls write_block_record(). Solution is to compute checksum
        in write_block_record() instead.
        - when executing an UNDO, we now pass the LSN of the _previous_ UNDO
        to block_format functions. This LSN can be 0 (if the being-executed UNDO
        was the transaction's first UNDO), so "undo_lsn==0" cannot work
        anymore to indicate "this is not UNDO work". Using undo_lsn==LSN_ERROR
        instead (this is an impossible LSN).
        - store into CLR_END the type of log record which was undone
        (INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has
        to update state.records if it sees this CLR_END in the REDO phase.
        - when writing the CLR_END in _ma_apply_undo_row_insert(),
        the place to store file's id is log_data+LSN_STORE_SIZE.
        - in _ma_apply_undo_row_insert(), the records-- is moved
        to a hook when writing the CLR_END (this way it is under log's mutex
        which is needed for "recovery of the state")
      storage/maria/ma_loghandler.c:
        - all UNDOs, and CLR_END, start with the LSN of another UNDO; so
        we can declare them "compressed".
        - write_hook_for_clr_end() to set trn->undo_lsn (to the previous
        UNDO's LSN) under log's lock (like UNDOs set trn->undo_lsn under log's
        lock), and also update, if appropriate, state.records.
        - reset share->id to 0 when deassigning; not useful for now but
        sounds logical.
      storage/maria/ma_recovery.c:
        - if no table is found for a REDO, it's not an error; for an UNDO, it is
        - in the REDO phase, when we see a CLR_END we must update trn->undo_lsn
        and sometimes state.records.
        - in the UNDO phase, when we execute an UNDO_ROW_INSERT:
          * update trn->undo_lsn only after executing the record
          * store the _previous_ undo_lsn into the CLR_END
        - at the end of the REDO phase, when we recreate TRN objects, they
        already have their long id in the log (either via a
        LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record), don't write
        a new, useless LOGREC_LONG_TRANSACTION_ID for them.
      storage/maria/ma_test1.c:
        * where to stop execution is now driven by --testflag and not --test-undo
        (ma_test2 already has --testflag for the same purpose). This allows
        us to do a clean stop (with commit) at any point.
        * --test-undo=# tells how to abort (flush all pages (which implies
        flushing log) or only log or nothing); all such "ways of crashing"
        are tested in ma_test_recovery
      storage/maria/ma_test_recovery:
        * Testing execution of UNDOs, with and without BLOBs.
        * Testing idempotency of REDOs.
        * See @todo for a probable bug with BLOBs.
        * maria_chk -rq instead of -r, as with -q it nicely stops on any
        problem in the data file (like the checksum bug; see the comment for
        ma_blockrec.c).
        * Testing if log was written by UNDO phase (often expected),
        not written by REDO phase (always expected).
        * Less output on the screen, compares with expected output in the end.
        * some shell thingies like "set --" and $# are courtesy of
        Danny and Pekka.
      storage/maria/maria_read_log.c:
        when only displaying the records, don't do an UNDO phase
      storage/maria/ma_test_recovery.expected:
        This is the expected output of a great part of ma_test_recovery.
        ma_test_recovery compares its output to the expected output
        and tells if different.
        If we look at this file it mentions differences in checksum
        (normal, it's not recovered yet) and in records count
        (getting a correct records' count when recovery starts on an
        already existing table, like when testing rollback,
        is coded but not yet pushed).
      d1886afc
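To make the CLR_END changes in d1886afc easier to follow, here is a minimal C sketch, not the actual Maria code. Every `sketch_` name, the struct layouts, and the use of UINT64_MAX as the "impossible" LSN are assumptions made for illustration. The records++ for an undone delete is inferred by symmetry from the records-- that the commit describes for an undone insert.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical types: they only mirror the ideas in the commit text. */
typedef uint64_t sketch_lsn;

/* An LSN that can never occur, meaning "this is not UNDO work".
   0 cannot play that role: it is the "previous UNDO" LSN of a
   transaction's first UNDO. The concrete value is an assumption. */
#define SKETCH_LSN_ERROR ((sketch_lsn) UINT64_MAX)

enum sketch_undone_type
{
  SKETCH_UNDONE_ROW_INSERT,
  SKETCH_UNDONE_ROW_DELETE,
  SKETCH_UNDONE_ROW_UPDATE
};

/* What a CLR_END carries according to the commit message. */
struct sketch_clr_end
{
  sketch_lsn previous_undo_lsn;       /* LSN of the _previous_ UNDO */
  enum sketch_undone_type undone;     /* type of record that was undone */
};

struct sketch_trn   { sketch_lsn undo_lsn; };
struct sketch_state { uint64_t records; };

/* Nonzero when the caller is executing an UNDO. */
static int sketch_is_undo_work(sketch_lsn undo_lsn)
{
  return undo_lsn != SKETCH_LSN_ERROR;
}

/* Effect of a CLR_END on in-memory state, whether it is being written
   in the UNDO phase or seen again in the REDO phase. */
static void sketch_apply_clr_end(const struct sketch_clr_end *clr,
                                 struct sketch_trn *trn,
                                 struct sketch_state *state)
{
  trn->undo_lsn= clr->previous_undo_lsn;
  switch (clr->undone)
  {
  case SKETCH_UNDONE_ROW_INSERT:      /* the inserted row is gone again */
    assert(state->records > 0);
    state->records--;
    break;
  case SKETCH_UNDONE_ROW_DELETE:      /* the deleted row is back (assumed) */
    state->records++;
    break;
  case SKETCH_UNDONE_ROW_UPDATE:      /* row count unchanged */
    break;
  }
}
```

Storing the undone record's type in the CLR_END is what lets the REDO phase pick the right branch without re-reading the original UNDO record.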
  7. 04 Sep, 2007 6 commits
    • unknown's avatar
      Merge bk-internal.mysql.com:/home/bk/mysql-maria · fdd6160b
      unknown authored
      into  mysql.com:/home/my/mysql-maria
      
      fdd6160b
    • unknown's avatar
      Added undo of deleted row · f366cd84
      unknown authored
      Added part of undo of update row
      Extended ma_test1 for recovery testing
      Some bug fixes
      
      
      storage/maria/ha_maria.cc:
        Ignore 'state.split' in case of block records
      storage/maria/ma_bitmap.c:
        Added return value for _ma_bitmap_find_place() for how much data we should put on head page
      storage/maria/ma_blockrec.c:
        Added undo of deleted row.
        - Added logging of CLR_END records in write_block_record()
        - Split ma_write_init_block_record() to two functions to get better code reuse
        - Added _ma_apply_undo_row_delete()
        - Added ma_get_length()
        
        Added 'empty' prototype for undo_row_update()
        
        Fixed bug when moving data within a head/tail page.
        Fixed bug when reading a page with bigger LSN but of different type than was expected.
        Store undo_lsn first in CLR_END record
        
        Simplified some code by adding local variables.
        Changed log format for UNDO_ROW_DELETE to store total length of used blobs
      storage/maria/ma_blockrec.h:
        Added prototypes for undo code.
      storage/maria/ma_pagecache.c:
        Allow plain page to change to LSN page (needed in recovery to apply UNDO)
      storage/maria/ma_recovery.c:
        Added undo handling of UNDO_ROW_DELETE and UNDO_ROW_UPDATE
      storage/maria/ma_test1.c:
        Extended --test-undo option to allow us to die after insert or after delete.
        Fixed bug in printing key values when using -v
      storage/maria/maria_def.h:
        Moved some variables around to get better alignment
        Added length_buff buffer to be used during undo handling
      f366cd84
    • unknown's avatar
      Check of transaction log descriptor table consistency added. · cedf65fb
      unknown authored
      Small fixes made.
      
      
      storage/maria/ma_loghandler.c:
        Check of transaction log descriptor table consistency added.
        Incorrect record description fixed.
        Compiler warning fixed.
      storage/maria/ma_loghandler.h:
        Fixed indentation.
      storage/maria/unittest/ma_test_loghandler-t.c:
        Suppressing of automatic record writing
      storage/maria/unittest/ma_test_loghandler_first_lsn-t.c:
        Suppressing of automatic record writing
      storage/maria/unittest/ma_test_loghandler_max_lsn-t.c:
        Suppressing of automatic record writing
      storage/maria/unittest/ma_test_loghandler_multigroup-t.c:
        Suppressing of automatic record writing
      storage/maria/unittest/ma_test_loghandler_multithread-t.c:
        Suppressing of automatic record writing
      storage/maria/unittest/ma_test_loghandler_noflush-t.c:
        Suppressing of automatic record writing
      storage/maria/unittest/ma_test_loghandler_pagecache-t.c:
        Suppressing of automatic record writing
      storage/maria/unittest/ma_test_loghandler_purge-t.c:
        Suppressing of automatic record writing
      cedf65fb
    • unknown's avatar
      Merge bk-internal.mysql.com:/home/bk/mysql-maria · 4d44697d
      unknown authored
      into  mysql.com:/home/my/mysql-maria
      
      
      storage/maria/ma_pagecache.c:
        Auto merged
      4d44697d
    • unknown's avatar
      Added UNDO handling of insert during recovery · 5f473af7
      unknown authored
      
      storage/maria/ma_blockrec.c:
        Added UNDO handling of insert during recovery
        To do this, I also had to add write locking of tail pages during undo phase (As we need to access the same page twice if extents are split over two pages)
        Another way to handle the undo of insert would be to store the extent information as part of the UNDO_INSERT block.
      storage/maria/ma_blockrec.h:
        Added new prototype
      storage/maria/ma_loghandler.c:
        Changed type of CLR_END (to avoid crash in log handler)
        Removed unused variable
      storage/maria/ma_loghandler.h:
        Added TRN argument to record_execute_in_undo_phase()
      storage/maria/ma_pagecache.c:
        Hack for undo phase of recovery.  During REDO we work with PLAIN pages, but UNDO works with LSN pages, which caused an abort when trying to access a cached page.
      storage/maria/ma_recovery.c:
        Added execution of UNDO_ROW_INSERT
      storage/maria/ma_test1.c:
        Added option --test-undo for testing recovery with undo
      storage/maria/maria_read_log.c:
        Added processing of undos
      5f473af7
    • unknown's avatar
      Spelling of comments fixed. · a564b67d
      unknown authored
      a564b67d
  8. 03 Sep, 2007 3 commits
    • unknown's avatar
    • unknown's avatar
      Merge bk-internal.mysql.com:/home/bk/mysql-maria · 1bcd63a2
      unknown authored
      into  mysql.com:/home/my/mysql-maria
      
      1bcd63a2
    • unknown's avatar
      Fixed several bugs found by running *.test with maria engine · 50cd8e3f
      unknown authored
      Renamed HA_EXTRA_PREPARE_FOR_DELETE to HA_EXTRA_PREPARE_FOR_DROP
      Added HA_EXTRA_PREPARE_FOR_RENAME (the code previously used HA_EXTRA_PREPARE_FOR_DELETE for renames too, which was confusing)
      Allow multiple write locks for same page by same file handle
      Don't write table state if table is not changed
      
      
      include/my_base.h:
        Renamed HA_EXTRA_PREPARE_FOR_DELETE to HA_EXTRA_PREPARE_FOR_DROP
        Added HA_EXTRA_PREPARE_FOR_RENAME (the code previously used HA_EXTRA_PREPARE_FOR_DELETE for renames too, which was confusing)
      mysql-test/r/maria.result:
        More tests of things that failed in other tests
      mysql-test/t/maria.test:
        More tests of things that failed in other tests
      sql/ha_partition.cc:
        HA_EXTRA_PREPARE_FOR_DELETE -> HA_EXTRA_PREPARE_FOR_DROP
        Use HA_EXTRA_PREPARE_FOR_RENAME for renames
      sql/ha_partition.h:
        HA_EXTRA_PREPARE_FOR_DELETE -> HA_EXTRA_PREPARE_FOR_DROP
        Use HA_EXTRA_PREPARE_FOR_RENAME for renames
      sql/lock.cc:
        Fixed comment
      sql/sql_table.cc:
        Fixed wrong usage of HA_EXTRA_PREPARE_FOR_DELETE
      storage/maria/ha_maria.cc:
        Added missing _ma_renable_logging_for_table() (when used with ALTER TABLE + repair index)
        Enabled fast generation of index
      storage/maria/ma_bitmap.c:
        Fixed bug when resetting full pages when page was a tail page
      storage/maria/ma_blockrec.c:
        Fixed several bugs found by running *.test with maria engine:
        During update we keep old changed pages locked with a write lock to be able to reuse them.
        - Fixed bug with allocated but not used tail part
        - Fixed bug with blob that only had tail part
        - Fixed bug when update reused a page (needed multiple write locks for same page)
        - Fixed bug when first extent was a tail block
      storage/maria/ma_check.c:
        Better error message when bitmap is destroyed
      storage/maria/ma_close.c:
        Only write status if file was changed.
        Fixed bug when maria_chk -e file_name changed the file.
      storage/maria/ma_dynrec.c:
        Removed unused argument to _ma_state_info_read_dsk
      storage/maria/ma_extra.c:
        HA_EXTRA_PREPARE_FOR_DELETE -> HA_EXTRA_PREPARE_FOR_DROP
        Use HA_EXTRA_PREPARE_FOR_RENAME for renames
        Only ignore flushing of pages for DROP (not rename)
      storage/maria/ma_locking.c:
        Removed unused argument to _ma_state_info_read_dsk
      storage/maria/ma_open.c:
        Removed unused argument to _ma_state_info_read_dsk
      storage/maria/ma_pagecache.c:
        Allow multiple write locks for same page by same file handle
        (Not yet complete, Sanja will fix)
      storage/maria/ma_recovery.c:
        HA_EXTRA_PREPARE_FOR_DELETE -> HA_EXTRA_PREPARE_FOR_DROP
      storage/maria/maria_def.h:
        Removed unused argument to _ma_state_info_read_dsk
      storage/myisam/mi_extra.c:
        HA_EXTRA_PREPARE_FOR_DELETE -> HA_EXTRA_PREPARE_FOR_DROP
        Use HA_EXTRA_PREPARE_FOR_RENAME for renames
        Only ignore flushing of pages for DROP (not rename)
      storage/myisammrg/ha_myisammrg.cc:
        HA_EXTRA_PREPARE_FOR_DELETE -> HA_EXTRA_PREPARE_FOR_DROP
        Use HA_EXTRA_PREPARE_FOR_RENAME for renames
      50cd8e3f
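The "allow multiple write locks for same page by same file handle" change in 50cd8e3f can be pictured as a lock that counts re-entries by its owner. The following is a minimal single-threaded C sketch of that bookkeeping only, assuming hypothetical names (`sketch_page_lock`, `sketch_write_lock`, `sketch_write_unlock`); it is not the page cache's actual implementation, and it leaves out waiting and thread safety entirely.

```c
#include <stddef.h>

/* Hypothetical per-page write-lock state. */
struct sketch_page_lock
{
  const void *owner;   /* file handle holding the write lock, NULL if free */
  unsigned    count;   /* how many times the owner has taken the lock */
};

/* Returns 1 if the lock was granted, 0 if another handle holds it.
   The same handle may lock the same page repeatedly, for example when
   an update reuses a page it already keeps write-locked. */
static int sketch_write_lock(struct sketch_page_lock *lock, const void *handle)
{
  if (lock->count == 0)
  {
    lock->owner= handle;
    lock->count= 1;
    return 1;
  }
  if (lock->owner == handle)
  {
    lock->count++;
    return 1;
  }
  return 0;
}

/* Returns 1 on success, 0 if 'handle' does not hold the lock.
   The page becomes free only when every lock has been released. */
static int sketch_write_unlock(struct sketch_page_lock *lock,
                               const void *handle)
{
  if (lock->count == 0 || lock->owner != handle)
    return 0;
  if (--lock->count == 0)
    lock->owner= NULL;
  return 1;
}
```

A counter rather than a boolean is the design point: the page stays locked against other handles until the owner has released it as many times as it acquired it.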
  9. 31 Aug, 2007 5 commits
    • unknown's avatar
      Merge desktop.sanja.is.com.ua:/home/bell/mysql/bk/mysql-maria · ac74b9c6
      unknown authored
      into  desktop.sanja.is.com.ua:/home/bell/mysql/bk/work-maria-purge
      
      
      storage/maria/ma_loghandler.c:
        Auto merged
      storage/maria/ma_loghandler.h:
        Auto merged
      ac74b9c6
    • unknown's avatar
      Merge desktop.sanja.is.com.ua:/home/bell/mysql/bk/mysql-maria · 35e77849
      unknown authored
      into  desktop.sanja.is.com.ua:/home/bell/mysql/bk/work-maria-test
      
      35e77849
    • unknown's avatar
      Merge bk-internal.mysql.com:/home/bk/mysql-maria · 95e47d88
      unknown authored
      into  mysql.com:/home/my/mysql-maria
      
      
      storage/maria/ma_blockrec.c:
        Auto merged
      storage/maria/ma_open.c:
        SCCS merged
      95e47d88
    • unknown's avatar
      Generalized the way update and redo extend the size of a directory record. · d8cfe620
      unknown authored
      
      storage/maria/ma_blockrec.c:
        Generalized the way update and redo extend the size of a directory record.
        This will (for now) ensure that data files are identical after a normal run and after an apply-log run.
      storage/maria/ma_open.c:
        Disabled reservation of transid on rows (for now) as these are not yet used.
        (I had to disable this as otherwise update thought rows had grown in size when they hadn't, so we had different row sizes on update and redo, which caused different block information)
      storage/maria/ma_test1.c:
        Added comment
      storage/maria/ma_test2.c:
        Do commit on error/abort
      storage/maria/ma_test_all.sh:
        Some more testing (to cover a bug that was not found in previous runs)
      storage/maria/ma_test_recovery:
        More tests
      d8cfe620
    • unknown's avatar
      Fixed bug in log "in progress" marking. · a304e187
      unknown authored
      
      storage/maria/ma_loghandler.c:
        Comments fixed.
        Fixed loop starting value.
      a304e187
  10. 29 Aug, 2007 4 commits
    • unknown's avatar
      WL#3072 Maria recovery · 6ef2553e
      unknown authored
      manual merge of ma_recovery.c (too big conflict to resolve in fmtool);
      the merged Monty's code allows correct replaying of REDO_PURGE_BLOCKS
      and was originally in
      monty@mysql.com/narttu.mysql.fi|ChangeSet|20070829060310|44058
      
      
      storage/maria/ma_recovery.c:
        * manually merging Monty's and Sanja's changes of the two last weeks
        to my massively modified version of this file. The merged Monty's
        code allows correct replaying of REDO_PURGE_BLOCKS and was originally
        in monty@mysql.com/narttu.mysql.fi|ChangeSet|20070829060310|44058 .
        * Setting the state to "STATE_CHANGED|etc" in Recovery is more
        logically done when we update the state in memory (for example
        records++).
      6ef2553e
    • unknown's avatar
      Merge gbichot@bk-internal.mysql.com:/home/bk/mysql-maria · e251add4
      unknown authored
      into  gbichot4.local:/home/mysql_src/mysql-maria-for-undo-phase
      
      
      storage/maria/ha_maria.cc:
        Auto merged
      storage/maria/ma_blockrec.c:
        Auto merged
      storage/maria/ma_loghandler.c:
        Auto merged
      storage/maria/ma_loghandler.h:
        Auto merged
      storage/maria/ma_loghandler_lsn.h:
        Auto merged
      storage/maria/maria_chk.c:
        Auto merged
      storage/maria/maria_read_log.c:
        Auto merged
      e251add4
    • unknown's avatar
      cleanups · 6302357d
      unknown authored
      
      storage/maria/ma_commit.c:
        theoretically unneeded, and could cause problems (when trnman_commit_trn()
        ends, the TRN may have been recycled and be in use by another thread
        already, we cannot touch it).
      storage/maria/maria_def.h:
        just include the existing file
      6302357d
    • unknown's avatar
      WL#3072 Maria recovery · da153a3b
      unknown authored
      * create page cache before initializing engine and not after, because
      Maria's recovery needs a page cache
      * make the creation of a bitmap page more crash-resistant
      * bugfix (see ma_blockrec.c)
      * back to old way: create an 8k bitmap page when creating table
      * preparations for the UNDO phase: recreate TRNs
      * preparations for Checkpoint: list of dirty pages, testing
      of rec_lsn to know if page should be skipped during Recovery
      (unused in this patch as no Checkpoint module pushed yet)
      * maria_chk tags repaired table with a special LSN
      * reworking all around in ma_recovery.c (less duplication)
      
      
      mysys/my_realloc.c:
        noted an issue in my_realloc()
      sql/mysqld.cc:
        page cache needs to be created before engines are initialized,
        because Maria's initialization may do a recovery which needs
        the page cache.
      storage/maria/ha_maria.cc:
        update to new prototype
      storage/maria/ma_bitmap.c:
        when creating the first bitmap page we used chsize to 8192 bytes, then
        pwrite to overwrite the last 2 bytes (8191-8192). If a crash happens
        between the two operations, this leaves a full bitmap page without its
        end marker. A later recovery may try to read this page, find that it
        exists but lacks the marker, conclude it's corrupted and fail.
        Changing the chsize to only 8190 bytes: recovery will then find
        the page is too short and recreate it entirely.
      storage/maria/ma_blockrec.c:
        Fix for a bug: when executing a REDO, if the data page is created,
        data_file_length was increased before _ma_bitmap_set():
        _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the
        increased data_file_length, expected to find a bitmap page on disk
        with a correct end marker; if the bitmap page didn't exist already
        in fact, this failed. Fixed by increasing data_file_length only after
        _ma_read_bitmap_page() has created the new bitmap page correctly.
        This bug could happen every time a REDO creates a new
        bitmap page.
      storage/maria/ma_check.c:
        empty data file has a bitmap page
      storage/maria/ma_control_file.c:
        useless parameter to ma_control_file_create_or_open(), just
        test if this is recovery.
      storage/maria/ma_control_file.h:
        new prototype
      storage/maria/ma_create.c:
        Back to how it was before: maria_create() creates an 8k bitmap page.
        Thus (bugfix) data_file_length needs to reflect this instead of being 0.
      storage/maria/ma_loghandler.c:
        as ma_test1 and ma_test2 now use real transactions and not
        dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always
        about real transactions, can assert this.
        A function for Recovery to assign a short id to a table.
      storage/maria/ma_loghandler.h:
        new function
      storage/maria/ma_loghandler_lsn.h:
        maria_chk tags repaired tables with this LSN
      storage/maria/ma_open.c:
        * enforce that DMLs on transactional tables use real transactions
        and not dummy_transaction_object.
        * test if table was repaired with maria_chk (which has to be
        seen as an import of an external table into the server), test
        validity of create_rename_lsn (header corruption detection)
        * comments.
      storage/maria/ma_recovery.c:
        * preparations for the UNDO phase: recreate TRNs
        * preparations for Checkpoint: list of dirty pages, testing
        of rec_lsn to know if page should be skipped during Recovery
        (unused in this patch as no Checkpoint module pushed yet)
        * reworking all around (less duplication)
      storage/maria/ma_recovery.h:
        a parameter to say if the UNDO phase should be skipped
      storage/maria/maria_chk.c:
        tag repaired tables with a special LSN
      storage/maria/maria_read_log.c:
        * update to new prototype
        * no UNDO phase in maria_read_log for now
      storage/maria/trnman.c:
        * a function for Recovery to create a transaction (TRN), needed
        in the UNDO phase
        * a function for Recovery to grab an existing transaction, needed
        in the UNDO phase (rollback all existing transactions)
      storage/maria/trnman_public.h:
        new functions
      da153a3b
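The ma_bitmap.c note in da153a3b describes a two-step file creation that is safe to interrupt. Below is a minimal C sketch of that ordering, under stated assumptions: it uses plain POSIX `ftruncate()` and `pwrite()` in place of the chsize call the commit mentions, and the `sketch_` names, the marker bytes passed by the caller, and the scratch-file helper are all hypothetical.

```c
#define _XOPEN_SOURCE 700
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

#define SKETCH_PAGE_SIZE   8192
#define SKETCH_MARKER_SIZE 2

/* Create the first bitmap page so that a crash between the two steps
   leaves a page that is visibly too short (8190 bytes) instead of a
   full-size page with a missing end marker. Returns 0 on success. */
static int sketch_create_first_bitmap_page(int fd,
                                           const unsigned char *marker)
{
  const off_t body_size= SKETCH_PAGE_SIZE - SKETCH_MARKER_SIZE;

  /* Step 1: extend the file to everything except the end marker. */
  if (ftruncate(fd, body_size) != 0)
    return -1;
  /* Step 2: appending the marker is what brings the page to full size. */
  if (pwrite(fd, marker, SKETCH_MARKER_SIZE, body_size) !=
      SKETCH_MARKER_SIZE)
    return -1;
  return 0;
}

/* Recovery-side check: a page shorter than SKETCH_PAGE_SIZE is treated
   as incomplete and would be recreated entirely. */
static int sketch_page_is_complete(int fd)
{
  off_t size= lseek(fd, 0, SEEK_END);
  return size >= SKETCH_PAGE_SIZE;
}

/* Helper for trying the sketch: an anonymous scratch file in /tmp. */
static int sketch_open_scratch_file(void)
{
  char path[]= "/tmp/sketch_bitmap_XXXXXX";
  int fd= mkstemp(path);
  if (fd >= 0)
    unlink(path);               /* file disappears when fd is closed */
  return fd;
}
```

With this ordering the file length itself tells recovery whether the marker write happened, so no separate "page is valid" flag is needed.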