    - WL#3239 "log CREATE TABLE in Maria" · de28fd57
    - WL#3240 "log DROP TABLE in Maria"
    - similarly, log RENAME TABLE, REPAIR/OPTIMIZE TABLE, and
    DELETE no_WHERE_clause (== the DELETE which just truncates the files)
    - create_rename_lsn added to MARIA_SHARE's state
    - all these operations (except DROP TABLE) also update the table's
    create_rename_lsn, which is needed for the correctness of
    Recovery (see function comment of _ma_repair_write_log_record()
    in ma_check.c)
    - write a COMMIT record when a transaction commits.
    - don't log REDOs/UNDOs if this is an internal temporary table
    like inside ALTER TABLE (I expect this to be a big win). There was
    already no logging for user-created "CREATE TEMPORARY" tables.
    - don't fsync files/directories if the table is not transactional
    - in translog_write_record(), autogenerate a 2-byte-id for the table
    and log the "id->name" pair (LOGREC_FILE_ID); log
    LOGREC_LONG_TRANSACTION_ID; automatically store
    the table's 2-byte-id in any log record.
    - preparations for Checkpoint: translog_get_horizon(); pausing Checkpoint
    when some dirty pages are unknown; capturing trn->rec_lsn,
    trn->first_undo_lsn for Checkpoint and log's low-water-mark computing.
    - assertions, comments.
    
    
    storage/maria/Makefile.am:
      more files to build
    storage/maria/ha_maria.cc:
      - logging a REPAIR log record if REPAIR/OPTIMIZE was successful.
      - ha_maria::data_file_type does not have to be set in every info()
      call, just do it once in open().
      - if caller said that transactionality can be disabled (like if
      caller is ALTER TABLE) i.e. thd->transaction.on==FALSE, then we
      temporarily disable transactionality of the table in external_lock();
      that will ensure that no REDOs/UNDOs are logged for this possibly
      massive write operation (they are not needed, as if any write fails,
      the table will be dropped). We re-enable in external_lock(F_UNLCK),
      which in ALTER TABLE happens before the tmp table replaces the original
      one (which is good, as thus the final table will have a REDO RENAME
      and a correct create_rename_lsn).
      - when we commit we also have to write a log record, so
      trnman_commit_trn() calls become ma_commit() calls
      - at the end of the engine's initialization, we are potentially
      entering a dangerous multi-threaded world (clients are going to be
      accepted), so some mutex-ownership assertions become enforceable;
      for that we set maria_multi_threaded=TRUE (see ma_control_file.c)
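
      The disable/re-enable cycle in external_lock() described above can
      be sketched as follows. This is only an illustration: every name in
      it (sketch_table, sketch_external_lock, the lock-type enum) is
      hypothetical, and how save_transactional gets its value is assumed,
      not taken from ha_maria.cc.

```c
#include <assert.h>

/* All names here are hypothetical stand-ins, not the real handler API. */
enum sketch_lock_type { SKETCH_LOCK, SKETCH_UNLOCK };

typedef struct {
  int transactional;       /* table currently logs REDOs/UNDOs */
  int save_transactional;  /* original setting (assumed saved at open time) */
} sketch_table;

static void sketch_external_lock(sketch_table *t, enum sketch_lock_type type,
                                 int transactions_on)
{
  if (type == SKETCH_UNLOCK)
    t->transactional = t->save_transactional;  /* re-enable on unlock */
  else if (!transactions_on)
    t->transactional = 0;                      /* bulk write: skip logging */
}
```

      Restoring on unlock, rather than at the end of the statement, is
      what lets the later RENAME of the temporary table be logged normally.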
    storage/maria/ha_maria.h:
      new member ha_maria::save_transactional (see also ha_maria.cc)
    storage/maria/ma_blockrec.c:
      - fixing comments according to discussion with Monty
      - if a table is transactional but temporarily non-transactional
      (like in ALTER TABLE), we need to give a sensible LSN to the pages
      (and, if we give 0, pagecache asserts).
      - translog_write_record() now takes care of storing the share's
      2-byte-id in the log record
    storage/maria/ma_blockrec.h:
      fixing comment according to discussion with Monty
    storage/maria/ma_check.c:
      When REPAIR/OPTIMIZE modify the data/index file, if this is a
      transactional table, they must sync it; if they remove files or rename
      files, they must sync the directory, so that everything is durable.
      This is just applying to REPAIR/OPTIMIZE the logic already implemented
      in CREATE/DROP/RENAME a few months ago.
      Adding a function to write a LOGREC_REPAIR_TABLE at end of
      REPAIR/OPTIMIZE (called only by ha_maria, not by maria_chk), and
      to update the table's create_rename_lsn.
    storage/maria/ma_close.c:
      fix for a future bug
    storage/maria/ma_control_file.c:
      ensuring that if Maria is running in multi-threaded mode, anybody
      wanting to write to the control file and update
      last_checkpoint_lsn/last_logno owns the log's lock.
    storage/maria/ma_control_file.h:
      see ma_control_file.c
    storage/maria/ma_create.c:
      when creating a table:
      - sync it and its directory only if this is a transactional table
      and there is a log (no point in syncing in maria_chk)
      - decouple the two uses of linkname/linkname_ptr (for index file and
      for data file) into more variables, as we need to know all links
      until the moment we write the LOGREC_CREATE_TABLE.
      - set share.data_file_type early so that _ma_initialize_data_file()
      knows it (Monty's bugfix so that a table always has at least a bitmap
      page when it is created; so data-file is not 0 bytes anymore).
      - log a LOGREC_CREATE_TABLE; it contains the bytes which we have
      just written to the index file's header. Update table's
      create_rename_lsn.
      - syncing of kfile had been broken in a previous merge; correcting it
      - syncing of dfile is now needed as it's not empty anymore
      - in _ma_initialize_data_file(), use share's block_size and not the
      global one. This is a gratuitous change, as both variables are equal;
      I just find it more future-proof to use the share-bound variable
      rather than the global one.
    storage/maria/ma_delete_all.c:
      log a LOGREC_DELETE_ALL record when doing ma_delete_all_rows();
      update create_rename_lsn then.
    storage/maria/ma_delete_table.c:
      - logging LOGREC_DROP_TABLE; knowing if this is needed requires
      knowing if the table is transactional, which requires opening the
      table.
      - we need to sync directories only if the table is transactional
    storage/maria/ma_extra.c:
      questions
    storage/maria/ma_init.c:
      when maria_end() is called, engine is not multithreaded
    storage/maria/ma_loghandler.c:
      - translog_inited has to be visible to ma_create() (see how it is used
      in ma_create())
      - checkpoint record will be a single record, not three
      - no REDO for TRUNCATE (TRUNCATE calls ma_create() internally so will
      log a REDO_CREATE)
      - adding REDO for DELETE no_WHERE_clause (fast DELETE of all rows by
      truncating the files), REPAIR.
      - MY_WAIT_IF_FULL to wait and retry if a log write hits a full disk
      - in translog_write_record(), if MARIA_SHARE does not yet have a
      2-byte-id, generate one for it and log LOGREC_FILE_ID; automatically
      store this short id into log records.
      - in translog_write_record(), if transaction has not logged its
      long trid, log LOGREC_LONG_TRANSACTION_ID.
      - For Checkpoint, we need to know the current end-of-log: adding
      translog_get_horizon().
      - For Control File, adding an assertion that the thread owns the
      log's lock (control file is protected by this lock)
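
      A minimal sketch of the "assign a 2-byte id on first use" logic
      described for translog_write_record(). The structures and the
      function are hypothetical, and treating 0 as "no id yet" is an
      assumption of the sketch, not something stated above.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
  uint16_t short_id;        /* 0 means "not assigned yet" (assumption) */
} sketch_share;

typedef struct {
  uint16_t next_id;         /* last id handed out */
  unsigned file_id_records; /* how many id->name records were logged */
  unsigned records;         /* how many ordinary records were logged */
  uint16_t last_stored_id;  /* id stored in the most recent record */
} sketch_log;

/* Logs one record for the given table; returns 0 on success, -1 on error. */
static int sketch_write_record(sketch_log *log, sketch_share *share)
{
  if (share->short_id == 0)
  {
    if (log->next_id == UINT16_MAX)
      return -1;                       /* out of 2-byte ids */
    share->short_id = ++log->next_id;
    log->file_id_records++;            /* stands in for LOGREC_FILE_ID */
  }
  log->last_stored_id = share->short_id; /* id stored automatically */
  log->records++;
  return 0;
}
```

      Callers never pass the id themselves; the first record written for
      a table pays for the extra id->name record, later ones do not.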
    storage/maria/ma_loghandler.h:
      Changes in log records (see ma_loghandler.c).
      new prototypes, new functions.
    storage/maria/ma_loghandler_lsn.h:
      adding a type LSN_WITH_FLAGS especially for TRN::first_undo_lsn,
      where the most significant byte is used for flags.
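
      To illustrate keeping flags in the most significant byte of an LSN
      value, here is a small sketch. It assumes a 64-bit integer whose low
      56 bits hold the LSN; the type, macro and function names are
      hypothetical and do not reproduce ma_loghandler_lsn.h.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t sketch_lsn_with_flags;

/* Low 56 bits: the LSN itself. Top byte: flags (layout is an assumption). */
#define SKETCH_LSN_MASK        ((UINT64_C(1) << 56) - 1)
#define SKETCH_LOGGED_LONG_ID  (UINT64_C(1) << 56)

static uint64_t sketch_lsn_of(sketch_lsn_with_flags v)
{
  return v & SKETCH_LSN_MASK;
}

static int sketch_has_flag(sketch_lsn_with_flags v, uint64_t flag)
{
  return (v & flag) != 0;
}

static sketch_lsn_with_flags sketch_with_flag(sketch_lsn_with_flags v,
                                              uint64_t flag)
{
  return v | flag;
}

/* Replaces the LSN part while leaving the flag byte untouched. */
static sketch_lsn_with_flags sketch_with_lsn(sketch_lsn_with_flags v,
                                             uint64_t lsn)
{
  return (v & ~SKETCH_LSN_MASK) | (lsn & SKETCH_LSN_MASK);
}
```

      Any comparison of such a value against a plain LSN has to go through
      the masking accessor first, otherwise a set flag makes it look huge.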
    storage/maria/ma_open.c:
      storing the create_rename_lsn in the index file's header (in the
      state, precisely) and retrieving it from there.
    storage/maria/ma_pagecache.c:
      - my set_if_bigger was wrong, correcting it
      - if the first_in_switch list is not empty, it means that
      changed_blocks misses some dirty pages, so Checkpoint cannot run and
      needs to wait. A variable missing_blocks_in_changed_list is added to
      tell that (should it be named missing_blocks_in_changed_blocks?)
      - pagecache_collect_changed_blocks_with_lsn() now also tells the
      minimum rec_lsn (needed for low-water mark computation).
    storage/maria/ma_pagecache.h:
      see ma_pagecache.c
    storage/maria/ma_panic.c:
      comment
    storage/maria/ma_range.c:
      comment
    storage/maria/ma_rename.c:
      - logging LOGREC_RENAME_TABLE; knowing if this is needed requires
      knowing if the table is transactional, which requires opening the
      table.
      - update create_rename_lsn
      - we need to sync directories only if the table is transactional
    storage/maria/ma_static.c:
      comment
    storage/maria/ma_test_all.sh:
      - tip for Valgrind-ing ma_test_all
      - do "export maria_path=somepath" before calling ma_test_all
      if you want to run ma_test_all outside of storage/maria (useful
      for parallel runs, like one normal and one under Valgrind: they
      must not use the same tables, so they need to run in different
      directories)
    storage/maria/maria_def.h:
      - state now contains, in memory and on disk, the create_rename_lsn
      - share now contains a 2-byte-id
    storage/maria/trnman.c:
      preparations for Checkpoint: capture trn->rec_lsn, trn->first_undo_lsn;
      minimum first_undo_lsn needed to know log's low-water-mark
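
      One way the captured minimums could feed a low-water-mark
      computation is sketched below. The combination rule (take the
      smallest of the pages' rec_lsn, the transactions' first_undo_lsn and
      the current horizon) and the use of 0 for "no LSN" are assumptions
      of this sketch; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Smallest non-zero entry, or 0 if there is none (0 = "no LSN", assumed). */
static uint64_t sketch_min_lsn(const uint64_t *lsns, size_t n)
{
  uint64_t min = 0;
  for (size_t i = 0; i < n; i++)
    if (lsns[i] != 0 && (min == 0 || lsns[i] < min))
      min = lsns[i];
  return min;
}

/* Oldest log position still needed, starting from the current horizon. */
static uint64_t sketch_low_water_mark(const uint64_t *page_rec_lsn,
                                      size_t n_pages,
                                      const uint64_t *trn_first_undo_lsn,
                                      size_t n_trns,
                                      uint64_t horizon)
{
  uint64_t mark = horizon;
  uint64_t m = sketch_min_lsn(page_rec_lsn, n_pages);
  if (m != 0 && m < mark)
    mark = m;
  m = sketch_min_lsn(trn_first_undo_lsn, n_trns);
  if (m != 0 && m < mark)
    mark = m;
  return mark;
}
```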
    storage/maria/trnman.h:
      using most significant byte of first_undo_lsn to hold miscellaneous
      flags, for now TRANSACTION_LOGGED_LONG_ID.
      dummy_transaction_object is already declared in ma_static.c.
    storage/maria/trnman_public.h:
      dummy_transaction_object was declared in every file that includes
      trnman_public.h, while in fact it's a single object.
      new prototype
    storage/maria/unittest/ma_test_loghandler-t.c:
      update for new prototype
    storage/maria/unittest/ma_test_loghandler_multigroup-t.c:
      update for new prototype
    storage/maria/unittest/ma_test_loghandler_multithread-t.c:
      update for new prototype
    storage/maria/unittest/ma_test_loghandler_pagecache-t.c:
      update for new prototype
    storage/maria/ma_commit.c:
      function which wraps:
      - writing a LOGREC_COMMIT record (== commit on disk)
      - calling trnman_commit_trn() (== commit in memory)
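
      The shape of such a wrapper can be sketched as below. The helper
      names and the struct are hypothetical, and skipping the in-memory
      commit when the log write fails is an assumption of the sketch, not
      a statement about ma_commit.c.

```c
#include <assert.h>

typedef struct {
  int commit_logged;  /* a commit record was written to the log */
  int committed;      /* transaction is committed in memory */
} sketch_trn;

/* Stand-in for writing the commit record; log_ok simulates disk success. */
static int sketch_write_commit_record(sketch_trn *trn, int log_ok)
{
  if (!log_ok)
    return 1;
  trn->commit_logged = 1;
  return 0;
}

/* Stand-in for the in-memory commit done by the transaction manager. */
static int sketch_commit_in_memory(sketch_trn *trn)
{
  trn->committed = 1;
  return 0;
}

/* Commit on disk first, then in memory; returns 0 on success. */
static int sketch_commit(sketch_trn *trn, int log_ok)
{
  if (sketch_write_commit_record(trn, log_ok))
    return 1;
  return sketch_commit_in_memory(trn);
}
```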
    storage/maria/ma_commit.h:
      new header file
    .tree-is-private:
      this file is now needed to keep our tree private (don't push it
      to public trees). When 5.1 is merged into mysql-maria, we can abandon
      our maria-specific post-commit trigger; .tree-is-private will take
      care of keeping commit mails private.