Commit fc08f82b authored by unknown

WL#3072 - Maria recovery

Unit test for recovery: runs ma_test1 and ma_test2 (both only with
INSERTs and DELETEs; UPDATEs disabled as not handled by recovery)
then moves the tables elsewhere, recreates the tables from the log,
compares them, and fails if there is a difference. Passes now.
Most of maria_read_log.c moved to ma_recovery.c, as it will be re-used
for recovery-from-ha_maria.
Bugfixes in applying REDO_INSERT and REDO_PURGE_ROW.
Applying of REDO_PURGE_BLOCKS, REDO_DELETE_ALL, REDO_DROP_TABLE,
UNDO_ROW_INSERT (in REDO phase only, i.e. just doing records++),
UNDO_ROW_DELETE, UNDO_ROW_PURGE.
Code cleanups.
Monty: please look for "QQ". Sanja: please look for "Sanja".
Future tasks: recovery of the bitmap (easy), recovery of the state
(make it idempotent), more REDOs (Monty to work on
REDO_UPDATE?), UNDO phase...
Pushing this cset as it looks safe, contains test and bugfixes which
will help Monty implement applying of REDO_UPDATE.


sql/handler.cc:
  typo
storage/maria/Makefile.am:
  Adding ma_test_recovery (which ma_test_all invokes, and which can
  also be run alone). Most of maria_read_log.c moved to ma_recovery.c
storage/maria/ha_maria.cc:
  comments
storage/maria/ma_bitmap.c:
  fixing comments. 2 -> sizeof(maria_bitmap_marker).
  Bitmap-related part of _ma_initialize_data_file() moves into the bitmap
  module.
  Now putting the "bm" signature when creating the first bitmap page
  (it used to happen only at next open, but that caused an annoying
  difference when testing Recovery if the original run didn't open the
  table, and it looks more logical like this: the page goes to disk only
  with its signature correct); see the "QQ" comment near the
  _ma_initialize_data_file() call in ma_create.c for more.
  When reading a bitmap page, verify its signature (happens when normally
  using the table or when CHECKing it; not when REPAIRing it).
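A minimal sketch of the signature handling described above, working on a plain memory buffer instead of the page cache. The helper names bitmap_page_init() and bitmap_page_is_valid() are hypothetical and do not exist in the Maria sources; only the 2-byte "bm" marker at the end of the page is taken from this commit.

```c
#include <assert.h>
#include <string.h>

/* Same 2-byte value as maria_bitmap_marker in ma_bitmap.c */
static const unsigned char bitmap_marker[2]= { 'b', 'm' };

/* Hypothetical helper: zero a bitmap page and put the signature at its end */
static void bitmap_page_init(unsigned char *page, size_t block_size)
{
  assert(block_size >= sizeof(bitmap_marker));
  memset(page, 0, block_size);
  memcpy(page + block_size - sizeof(bitmap_marker),
         bitmap_marker, sizeof(bitmap_marker));
}

/* Hypothetical helper: returns 1 if the page ends with the signature */
static int bitmap_page_is_valid(const unsigned char *page, size_t block_size)
{
  assert(block_size >= sizeof(bitmap_marker));
  return memcmp(page + block_size - sizeof(bitmap_marker),
                bitmap_marker, sizeof(bitmap_marker)) == 0;
}
```

Using sizeof(bitmap_marker) rather than a literal 2 mirrors the "2 -> sizeof(maria_bitmap_marker)" change listed for this file.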
storage/maria/ma_blockrec.c:
  * no need to sync the data file if table is not transactional
  * Comments, code cleanup (log-related data moved to log-related code
  block, int5store->page_store).
  * Store the table's short id into LOGREC_UNDO_ROW_PURGE, like we
  do for other records (though this record will soon be replaced
  with a CLR).
  * If "page" is 1 it means the page which extends from byte
  page*block_size+1 to (page+1)*block_size (byte number 1 being
  the first byte of the file). The last byte of the file is
  data_file_length (same convention).
  A new page needs to be created if the last byte of the page is
  beyond the last byte of the file, i.e.
   (page+1)*block_size+1 > data_file_length, so we correct the test
  (bug found when testing log applying for ma_test1 -M -T --skip-update).
  * update the page's LSN when removing a row from it during
  execution of a REDO_PURGE_ROW record (bug found when testing log
  applying for ma_test1 -M -T --skip-update).
  * applying of REDO_PURGE_BLOCKs (limited to a one-page range for now).
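The page-boundary rule above can be sketched as follows. This is only an illustration of the wording "the last byte of the page is beyond the last byte of the file" under the stated 1-based byte numbering; page_needs_creation() is a hypothetical helper, not a function from ma_blockrec.c. Read literally, the wording gives (page+1)*block_size > data_file_length; the expression in the message carries an additional +1, so the exact boundary case should be checked against the test in ma_blockrec.c.

```c
#include <assert.h>
#include <stdint.h>

/*
  Hypothetical helper. With 1-based byte numbering, page N spans bytes
  N*block_size+1 .. (N+1)*block_size, and data_file_length is the number
  of the file's last byte.
*/
static int page_needs_creation(uint64_t page, uint32_t block_size,
                               uint64_t data_file_length)
{
  uint64_t last_byte_of_page;
  assert(block_size > 0);
  last_byte_of_page= (page + 1) * block_size;
  /* The page must be created if it ends beyond the end of the file */
  return last_byte_of_page > data_file_length;
}
```

With an 8192-byte block and an 8192-byte file, page 0 already exists while page 1 has to be created; a file whose last page is truncated also reports that page as needing creation.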
storage/maria/ma_blockrec.h:
  new functions. maria_bitmap_marker does not need to be exported.
storage/maria/ma_close.c:
  we can always flush the table's state when closing the last instance
  of the table. And it is needed for maria_read_log (as it does
  not use maria_lock_database()).
storage/maria/ma_control_file.c:
  when in Recovery, some assertions should not be used.
storage/maria/ma_control_file.h:
  double-inclusion safe
storage/maria/ma_create.c:
  during recovery, don't log records. Comments.
  Moving the creation of the first bitmap page to ma_bitmap.c
storage/maria/ma_delete_table.c:
  during recovery, don't log records. Log the end-zero of the dropped
  table's name, so that recovery can use the string in place without
  extending it to fit an end zero.
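A small sketch of why the end-zero is logged, using a flat buffer as a stand-in for the log record. log_table_name() and logged_table_name() are hypothetical names introduced for illustration; the only detail taken from the commit is that the logged length is strlen(name) + 1.

```c
#include <assert.h>
#include <string.h>

/*
  Hypothetical writer: copies the name including its terminating zero
  and returns the number of bytes logged.
*/
static size_t log_table_name(char *record, size_t record_size,
                             const char *name)
{
  size_t length= strlen(name) + 1;   /* + 1: log the end-zero too */
  assert(length <= record_size);
  memcpy(record, name, length);
  return length;
}

/*
  Hypothetical reader: because the end-zero is part of the record, the
  bytes can be used as a C string in place, without copying or extending.
*/
static const char *logged_table_name(const char *record, size_t length)
{
  assert(length > 0 && record[length - 1] == '\0');
  return record;
}
```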
storage/maria/ma_loghandler.c:
  * inwrite_rec_hook also needs access to the MARIA_SHARE, like
  prewrite_rec_hook. This will be needed to update
  share->records_diff (in the upcoming patch "recovery of the state").
  * LOG_DESC::record_ends_group changed to an enum.
  * LOG_DESC for LOGREC_REDO_PURGE_BLOCKS and LOGREC_UNDO_ROW_PURGE
  corrected
  * Sanja please see the @todo LOG BUG
  * avoiding DBUG_RETURN(func()) as it gives confusing debug traces.
storage/maria/ma_loghandler.h:
  - log write hooks called while the log's lock is held (inwrite_rec_hook)
  now need the MARIA_SHARE, like prewrite_rec_hook already had
  - instead of a bool saying if this record's type ends groups or not,
  we refine: it may not end a group, it may end a group, or it may
  be a group in itself. Imagine that we had a physical write failure
  to a table before we log the UNDO, we still end up in
  external_lock(F_UNLCK) and then we log a COMMIT: we don't want
  to consider this COMMIT as ending the group of REDOs (don't want
  to execute those REDOs during Recovery), that's why we say "COMMIT
  is a group in itself, it aborts any previous group". This also
  gives one more sanity check in maria_read_log.
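The three-way distinction above can be sketched as a small state machine. The enum values are the ones added to ma_loghandler.h by this commit; struct group_scan and scan_record() are hypothetical bookkeeping written for illustration, and the real logic in ma_recovery.c may be organized differently.

```c
#include <assert.h>
#include <stddef.h>

enum enum_record_in_group {
  LOGREC_NOT_LAST_IN_GROUP= 0, LOGREC_LAST_IN_GROUP, LOGREC_IS_GROUP_ITSELF
};

/* Hypothetical counters kept while scanning the log */
struct group_scan {
  size_t pending;   /* records of the current, still unfinished group */
  size_t executed;  /* records handed over for execution */
  size_t aborted;   /* records dropped because their group never ended */
};

static void scan_record(struct group_scan *scan,
                        enum enum_record_in_group kind)
{
  assert(scan != NULL);
  switch (kind) {
  case LOGREC_NOT_LAST_IN_GROUP:
    scan->pending++;                      /* wait for the group's end */
    break;
  case LOGREC_LAST_IN_GROUP:
    scan->executed+= scan->pending + 1;   /* whole group is complete */
    scan->pending= 0;
    break;
  case LOGREC_IS_GROUP_ITSELF:
    scan->aborted+= scan->pending;        /* abort any previous group */
    scan->pending= 0;
    scan->executed++;                     /* the record stands alone */
    break;
  }
}
```

In the failure scenario described above (REDOs logged, write failure before the UNDO, then a COMMIT), the REDOs end up counted as aborted and only the COMMIT is executed.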
storage/maria/ma_recovery.c:
  New Recovery code, replacing the old pseudocode.
  Most of maria_read_log moved here.
  Call-able from ha_maria, but not enabled yet.
  Compared to the previous version of maria_read_log, some bugs have
  been fixed, debugging output can go to stdout or a disk file (for now
  it's useful for me, later it can be changed), execution of
  REDO_DROP_TABLE, REDO_DELETE_ALL, REDO_PURGE_BLOCKS has been added. Duplicate code
  has been factored into functions. We abort an unfinished group
  of records if we see a record which is a group in itself (like COMMIT).
  No need for maria_panic() after a bug (which caused tables to not
  be closed) was fixed; if there is yet another bug I prefer to see it.
  When opening a table for Recovery, set data_file_length
  and key_file_length to their real physical value (these are the
  easiest state members to restore :). Warn us if the last page
  was truncated (but Recovery handles it).
  MARIA_SHARE::state::state::records is now partly recovered (not
  idempotent, but works if recreating tables from scratch).
  When applying a REDO to a page, stamp it with the UNDO's LSN
  (current_group_end_lsn), not with the REDO's LSN; it makes
  the table more identical to the original table (easier to compare
  the two tables in the end).
  Big thing missing: some types of REDOs are not handled,
  and the UNDO phase does not exist (missing functions to execute UNDOs
  to actually rollback). So for now tests are only inserting/deleting
  a few 100 rows, closing the table and seeing if the log is applied ok;
  it works. UPDATE not handled.
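A rough sketch of the LSN stamping rule mentioned above: every page touched by a REDO of a group receives the LSN of the record ending the group, not the REDO's own LSN. The types lsn_t, struct page and struct redo, and the function apply_group(), are simplified stand-ins invented for this example; only the name current_group_end_lsn comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t lsn_t;                 /* stand-in for the engine's LSN */

struct page { lsn_t lsn; };             /* only the page's LSN stamp */
struct redo { lsn_t lsn; struct page *target; };

/* Hypothetical: apply all REDOs of one finished group */
static void apply_group(struct redo *redos, size_t count,
                        lsn_t current_group_end_lsn)
{
  size_t i;
  for (i= 0; i < count; i++)
  {
    /* A REDO is always logged before the record which ends its group */
    assert(redos[i].lsn <= current_group_end_lsn);
    /* (the actual page modification is omitted in this sketch) */
    redos[i].target->lsn= current_group_end_lsn;
  }
}
```

Stamping with the group-end LSN reproduces what the original run left on the pages, which is what lets the test compare the recreated data file byte for byte.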
storage/maria/ma_recovery.h:
  new functions: maria_recover() for recovery from inside ha_maria;
  maria_apply_log() for maria_read_log (maria_recover() calls
  maria_apply_log()).
  Btw, we need to not use the word "recover" for REPAIR/maria_chk anymore.
storage/maria/ma_rename.c:
  don't write log records during recovery
storage/maria/ma_test2.c:
  - fail if maria_info() or other subtests find some wrong information
  - new option -g to skip updates.
  - init the translog before creating the table, so that log applying
  can work.
  - in "#if 0" you'll see some fixed bugs (will be removed).
storage/maria/ma_test_all.sh:
  cleanup files. Test log applying.
storage/maria/maria_read_log.c:
  most of the logic moves to ma_recovery.c to be shared between
  maria_read_log and recovery-from-inside-mysqld.
  See ma_recovery.c for additional changes made to the moved code.
storage/maria/ma_test_recovery:
  unit test for Recovery. Tests insert and delete,
  REDO_UPDATE not yet coded.
  Script is called from ma_test_all. Can run standalone.
parent 97a41052
......@@ -2788,7 +2788,7 @@ int ha_change_key_cache(KEY_CACHE *old_key_cache,
int ha_init_pagecache(const char *name, PAGECACHE *pagecache)
{
DBUG_ENTER("ha_init_key_cache");
DBUG_ENTER("ha_init_pagecache");
if (!pagecache->inited)
{
......
......@@ -30,8 +30,8 @@ DEFS = @DEFS@
# "." is needed first because tests in unittest need libmaria
SUBDIRS = . unittest
EXTRA_DIST = ma_test_all.sh ma_test_all.res ma_ft_stem.c CMakeLists.txt plug.in
pkgdata_DATA = ma_test_all ma_test_all.res
EXTRA_DIST = ma_test_all.sh ma_test_all.res ma_ft_stem.c CMakeLists.txt plug.in ma_test_recovery
pkgdata_DATA = ma_test_all ma_test_all.res ma_test_recovery
pkglib_LIBRARIES = libmaria.a
bin_PROGRAMS = maria_chk maria_pack maria_ftdump maria_read_log
maria_chk_DEPENDENCIES= $(LIBRARIES)
......@@ -61,7 +61,7 @@ noinst_HEADERS = maria_def.h ma_rt_index.h ma_rt_key.h ma_rt_mbr.h \
ma_ft_eval.h trnman.h lockman.h tablockman.h \
ma_control_file.h ha_maria.h ma_blockrec.h \
ma_loghandler.h ma_loghandler_lsn.h ma_pagecache.h \
ma_commit.h
ma_recovery.h ma_commit.h
ma_test1_DEPENDENCIES= $(LIBRARIES)
ma_test1_LDADD= @CLIENT_EXTRA_LDFLAGS@ libmaria.a \
$(top_builddir)/storage/myisam/libmyisam.a \
......@@ -120,7 +120,7 @@ libmaria_a_SOURCES = ma_init.c ma_open.c ma_extra.c ma_info.c ma_rkey.c \
ma_rt_index.c ma_rt_key.c ma_rt_mbr.c ma_rt_split.c \
ma_sp_key.c ma_control_file.c ma_loghandler.c \
ma_pagecache.c ma_pagecaches.c \
ma_commit.c
ma_recovery.c ma_commit.c
CLEANFILES = test?.MA? FT?.MA? isam.log ma_test_all ma_rt_test.MA? sp_test.MA?
SUFFIXES = .sh
......
......@@ -37,6 +37,15 @@
#define trans_register_ha(A, B, C) do { /* nothing */ } while(0)
#endif
/**
@todo For now there is no way for a user to set a different value of
maria_recover_options, i.e. auto-check-and-repair is always disabled.
We could enable it. As the auto-repair is initiated when opened from the
SQL layer (open_unireg_entry(), check_and_repair()), it does not happen
when Maria's Recovery internally opens the table to apply log records to
it, which is good. It would happen only after Recovery, if the table is
still corrupted.
*/
ulong maria_recover_options= HA_RECOVER_NONE;
static handlerton *maria_hton;
......@@ -1877,6 +1886,10 @@ int ha_maria::external_lock(THD *thd, int lock_type)
corresponding unlock (they just stay locked and are later dropped while
locked); if a tmp table was transactional, "SELECT FROM non_tmp, tmp"
would never commit as its "locked_tables" count would stay 1.
When Maria has has_transactions()==TRUE, open_temporary_table()
(sql_base.cc) will use TRANSACTIONAL_TMP_TABLE and thus the
external_lock(F_UNLCK) will happen and we can then allow the user to
create transactional temporary tables.
*/
if (!file->s->base.born_transactional)
goto skip_transaction;
......
......@@ -130,6 +130,7 @@
#define FULL_HEAD_PAGE 4
#define FULL_TAIL_PAGE 7
/** all bitmap pages end with this 2-byte signature */
uchar maria_bitmap_marker[2]= {(uchar) 'b',(uchar) 'm'};
static my_bool _ma_read_bitmap_page(MARIA_SHARE *share,
......@@ -244,7 +245,7 @@ my_bool _ma_bitmap_end(MARIA_SHARE *share)
/*
Flush bitmap to disk
Send updated bitmap to the page cache
SYNOPSIS
_ma_flush_bitmap()
......@@ -286,7 +287,7 @@ my_bool _ma_flush_bitmap(MARIA_SHARE *share)
share Share handler
NOTES
This is called on ma_delete_all (truncate data file).
This is called on maria_delete_all_rows (truncate data file).
*/
void _ma_bitmap_delete_all(MARIA_SHARE *share)
......@@ -294,8 +295,9 @@ void _ma_bitmap_delete_all(MARIA_SHARE *share)
MARIA_FILE_BITMAP *bitmap= &share->bitmap;
if (bitmap->map) /* Not in create */
{
bzero(bitmap->map, share->block_size);
memcpy(bitmap->map + share->block_size - 2, maria_bitmap_marker, 2);
bzero(bitmap->map, bitmap->block_size);
memcpy(bitmap->map + bitmap->block_size - sizeof(maria_bitmap_marker),
maria_bitmap_marker, sizeof(maria_bitmap_marker));
bitmap->changed= 1;
bitmap->page= 0;
bitmap->used_size= bitmap->total_size;
......@@ -497,6 +499,10 @@ static void _ma_print_bitmap(MARIA_FILE_BITMAP *bitmap)
TODO
Update 'bitmap->used_size' to real size of used bitmap
NOTE
We don't always have share->bitmap.bitmap_lock here
(when called from _ma_check_bitmap_data() for example).
RETURN
0 ok
1 error (Error writing old bitmap or reading bitmap page)
......@@ -516,7 +522,8 @@ static my_bool _ma_read_bitmap_page(MARIA_SHARE *share,
{
share->state.state.data_file_length= position + bitmap->block_size;
bzero(bitmap->map, bitmap->block_size);
memcpy(bitmap->map + share->block_size - 2, maria_bitmap_marker, 2);
memcpy(bitmap->map + bitmap->block_size - sizeof(maria_bitmap_marker),
maria_bitmap_marker, sizeof(maria_bitmap_marker));
bitmap->used_size= 0;
#ifndef DBUG_OFF
memcpy(bitmap->map + bitmap->block_size, bitmap->map, bitmap->block_size);
......@@ -525,11 +532,14 @@ static my_bool _ma_read_bitmap_page(MARIA_SHARE *share,
}
bitmap->used_size= bitmap->total_size;
DBUG_ASSERT(share->pagecache->block_size == bitmap->block_size);
res= pagecache_read(share->pagecache,
(PAGECACHE_FILE*)&bitmap->file, page, 0,
(byte*) bitmap->map,
PAGECACHE_PLAIN_PAGE,
PAGECACHE_LOCK_LEFT_UNLOCKED, 0) == 0;
res= (pagecache_read(share->pagecache,
(PAGECACHE_FILE*)&bitmap->file, page, 0,
(byte*) bitmap->map,
PAGECACHE_PLAIN_PAGE,
PAGECACHE_LOCK_LEFT_UNLOCKED, 0) == NULL) |
memcmp(bitmap->map + bitmap->block_size -
sizeof(maria_bitmap_marker),
maria_bitmap_marker, sizeof(maria_bitmap_marker));
#ifndef DBUG_OFF
if (!res)
memcpy(bitmap->map + bitmap->block_size, bitmap->map, bitmap->block_size);
......@@ -1630,9 +1640,16 @@ static my_bool set_page_bits(MARIA_HA *info, MARIA_FILE_BITMAP *bitmap,
bitmap->changed= 1;
DBUG_EXECUTE("bitmap", _ma_print_bitmap(bitmap););
if (fill_pattern != 3 && fill_pattern != 7 &&
bitmap_page < info->s->state.first_bitmap_with_space)
info->s->state.first_bitmap_with_space= bitmap_page;
if (fill_pattern != 3 && fill_pattern != 7)
set_if_smaller(info->s->state.first_bitmap_with_space, bitmap_page);
/*
Note that if the condition above is false (page is full), and all pages of
this bitmap are now full, and that bitmap page was
first_bitmap_with_space, we don't modify first_bitmap_with_space, indeed
its value still tells us where to start our search for a bitmap with space
(which is for sure after this full one).
That does mean that first_bitmap_with_space is only a lower bound.
*/
DBUG_RETURN(0);
}
......@@ -1747,8 +1764,7 @@ my_bool _ma_reset_full_page_bits(MARIA_HA *info, MARIA_FILE_BITMAP *bitmap,
tmp= (1 << bit_count) - 1;
*data&= ~tmp;
}
if (bitmap_page < info->s->state.first_bitmap_with_space)
info->s->state.first_bitmap_with_space= bitmap_page;
set_if_smaller(info->s->state.first_bitmap_with_space, bitmap_page);
bitmap->changed= 1;
DBUG_EXECUTE("bitmap", _ma_print_bitmap(bitmap););
DBUG_RETURN(0);
......@@ -2014,3 +2030,28 @@ my_bool _ma_check_if_right_bitmap_type(MARIA_HA *info,
DBUG_ASSERT(0);
return 1;
}
/**
@brief create the first bitmap page of a freshly created data file
@param share table's share
@return Operation status
@retval 0 OK
@retval !=0 Error
*/
int _ma_bitmap_create_first(MARIA_SHARE *share)
{
uint block_size= share->bitmap.block_size;
File file= share->bitmap.file.file;
if (my_chsize(file, block_size, 0, MYF(MY_WME)) ||
my_pwrite(file, maria_bitmap_marker, sizeof(maria_bitmap_marker),
block_size - sizeof(maria_bitmap_marker),
MYF(MY_NABP | MY_WME)))
return 1;
share->state.state.data_file_length= block_size;
_ma_bitmap_delete_all(share);
return 0;
}
......@@ -105,8 +105,6 @@ enum en_page_type { UNALLOCATED_PAGE, HEAD_PAGE, TAIL_PAGE, BLOB_PAGE, MAX_PAGE_
/* Don't allocate memory for too many row extents on the stack */
#define ROW_EXTENTS_ON_STACK 32
extern uchar maria_bitmap_marker[2];
/* Functions to convert MARIA_RECORD_POS to/from page:offset */
static inline MARIA_RECORD_POS ma_recordpos(ulonglong page, uint dir_entry)
......@@ -178,6 +176,7 @@ my_bool _ma_check_if_right_bitmap_type(MARIA_HA *info,
ulonglong page,
uint *bitmap_pattern);
void _ma_bitmap_delete_all(MARIA_SHARE *share);
int _ma_bitmap_create_first(MARIA_SHARE *share);
uint _ma_apply_redo_insert_row_head_or_tail(MARIA_HA *info, LSN lsn,
uint page_type,
const byte *header,
......@@ -186,3 +185,5 @@ uint _ma_apply_redo_insert_row_head_or_tail(MARIA_HA *info, LSN lsn,
uint _ma_apply_redo_purge_row_head_or_tail(MARIA_HA *info, LSN lsn,
uint page_type,
const byte *header);
uint _ma_apply_redo_purge_blocks(MARIA_HA *info, LSN lsn,
const byte *header);
......@@ -87,7 +87,7 @@ int maria_close(register MARIA_HA *info)
may be using the file at this point
IF using --external-locking, which does not apply to Maria.
*/
if (share->mode != O_RDONLY && maria_is_crashed(info))
if (share->mode != O_RDONLY)
_ma_state_info_write(share->kfile.file, &share->state, 1);
if (my_close(share->kfile.file, MYF(0)))
error= my_errno;
......
......@@ -51,6 +51,8 @@ uint32 last_logno= FILENO_IMPOSSIBLE;
it is called at startup.
*/
my_bool maria_multi_threaded= FALSE;
/** @brief if currently doing a recovery */
my_bool maria_in_recovery= FALSE;
/*
Control file is less than 512 bytes (a disk sector),
......
......@@ -18,6 +18,9 @@
First version written by Guilhem Bichot on 2006-04-27.
*/
#ifndef _ma_control_file_h
#define _ma_control_file_h
#define CONTROL_FILE_BASE_NAME "maria_log_control"
/* Here is the interface of this module */
......@@ -33,7 +36,7 @@ extern LSN last_checkpoint_lsn;
*/
extern uint32 last_logno;
extern my_bool maria_multi_threaded;
extern my_bool maria_multi_threaded, maria_in_recovery;
typedef enum enum_control_file_error {
CONTROL_FILE_OK= 0,
......@@ -74,3 +77,4 @@ int ma_control_file_end();
#ifdef __cplusplus
}
#endif
#endif
......@@ -677,7 +677,7 @@ int maria_create(const char *name, enum data_file_type datafile_type,
/* max_data_file_length and max_key_file_length are recalculated on open */
if (tmp_table)
share.base.max_data_file_length= (my_off_t) ci->data_file_length;
else if (ci->transactional && translog_inited)
else if (ci->transactional && translog_inited && !maria_in_recovery)
{
/*
we have checked translog_inited above, because maria_chk may call us
......@@ -940,23 +940,31 @@ int maria_create(const char *name, enum data_file_type datafile_type,
for (i= TRANSLOG_INTERNAL_PARTS;
i < (sizeof(log_array)/sizeof(log_array[0])); i++)
total_rec_length+= log_array[i].length;
/*
For this record to be of any use for Recovery, we need the upper
MySQL layer to be crash-safe, which it is not now (that would require
work using the ddl_log of sql/sql_table.cc); when it is, we should
reconsider the moment of writing this log record (before or after op,
under THR_LOCK_maria or not...), how to use it in Recovery.
For now this record can serve when we apply logs to a backup,
so we sync it. This happens before the data file is created. If the data
file was created before, and we crashed before writing the log record,
at restart the table may be used, so we would not have a trustable
history in the log (impossible to apply this log to a backup). The way
we do it, if we crash before writing the log record then there is no
data file and the table cannot be used.
Note that in case of TRUNCATE TABLE we also come here.
When in CREATE/TRUNCATE (or DROP or RENAME or REPAIR) we have not called
external_lock(), so have no TRN. It does not matter, as all these
operations are non-transactional and sync their files.
/**
For this record to be of any use for Recovery, we need the upper
MySQL layer to be crash-safe, which it is not now (that would require
work using the ddl_log of sql/sql_table.cc); when it is, we should
reconsider the moment of writing this log record (before or after op,
under THR_LOCK_maria or not...), how to use it in Recovery.
For now this record can serve when we apply logs to a backup,
so we sync it. This happens before the data file is created. If the
data file was created before, and we crashed before writing the log
record, at restart the table may be used, so we would not have a
trustable history in the log (impossible to apply this log to a
backup). The way we do it, if we crash before writing the log record
then there is no data file and the table cannot be used.
@todo Note that in case of TRUNCATE TABLE we also come here; for
Recovery to be able to finish TRUNCATE TABLE, instead of leaving a
half-truncated table, we should log the record at start of
maria_create(); for that we shouldn't write to the index file but to a
buffer (DYNAMIC_STRING), put the buffer into the record, then put the
buffer into the index file (so, change _ma_keydef_write() etc). That
would also enable Recovery to finish a CREATE TABLE. The final result
would be that we would be able to finish what the SQL layer has asked
for: it would be atomic.
When in CREATE/TRUNCATE (or DROP or RENAME or REPAIR) we have not
called external_lock(), so have no TRN. It does not matter, as all
these operations are non-transactional and sync their files.
*/
if (unlikely(translog_write_record(&share.state.create_rename_lsn,
LOGREC_REDO_CREATE_TABLE,
......@@ -1016,6 +1024,20 @@ int maria_create(const char *name, enum data_file_type datafile_type,
goto err;
errpos=3;
/*
QQ: this sets data_file_length from 0 to 8192, but we wrote the state
already to the index file (because:
- log record is built from index header so state must be written before
log record
- data file must be created after log record, so that "missing log
record" implies "unusable table").
Thus, we below create a 8192-byte data file, but its recorded size is 0,
so next time we read the bitmap (a maria_write() for example) we'll
overwrite the bitmap we just created below.
It's not very efficient. Though there is no bug.
Why do we absolutely want to create a 8192-byte page for a freshly
created, empty table? Why don't we leave the data file empty?
*/
if (_ma_initialize_data_file(&share, dfile))
goto err;
}
......@@ -1159,11 +1181,14 @@ int _ma_initialize_data_file(MARIA_SHARE *share, File dfile)
{
if (share->data_file_type == BLOCK_RECORD)
{
if (my_chsize(dfile, share->base.block_size, 0, MYF(MY_WME)))
return 1;
share->state.state.data_file_length= share->base.block_size;
_ma_bitmap_delete_all(share);
share->bitmap.block_size= share->base.block_size;
share->bitmap.file.file = dfile;
return _ma_bitmap_create_first(share);
}
/*
So, in BLOCK_RECORD, a freshly created datafile is one page long; while in
other formats it is 0-byte long.
*/
return 0;
}
......
......@@ -64,7 +64,8 @@ int maria_delete_table(const char *name)
raid_type= info->s->base.raid_type;
raid_chunks= info->s->base.raid_chunks;
#endif
sync_dir= (info->s->now_transactional && !info->s->temporary) ?
sync_dir= (info->s->now_transactional && !info->s->temporary &&
!maria_in_recovery) ?
MY_SYNC_DIR : 0;
maria_close(info);
}
......@@ -85,7 +86,7 @@ int maria_delete_table(const char *name)
LSN lsn;
LEX_STRING log_array[TRANSLOG_INTERNAL_PARTS + 1];
log_array[TRANSLOG_INTERNAL_PARTS + 0].str= (char *)name;
log_array[TRANSLOG_INTERNAL_PARTS + 0].length= strlen(name);
log_array[TRANSLOG_INTERNAL_PARTS + 0].length= strlen(name) + 1;
if (unlikely(translog_write_record(&lsn, LOGREC_REDO_DROP_TABLE,
&dummy_transaction_object, NULL,
log_array[TRANSLOG_INTERNAL_PARTS +
......
......@@ -289,7 +289,7 @@ typedef my_bool(*prewrite_rec_hook) (enum translog_record_type type,
struct st_translog_parts *parts);
typedef my_bool(*inwrite_rec_hook) (enum translog_record_type type,
TRN *trn,
TRN *trn, struct st_maria_share *share,
LSN *lsn,
struct st_translog_parts *parts);
......@@ -309,6 +309,11 @@ enum record_class
/* C++ can't bear that a variable's name is "class" */
#ifndef __cplusplus
enum enum_record_in_group {
LOGREC_NOT_LAST_IN_GROUP= 0, LOGREC_LAST_IN_GROUP, LOGREC_IS_GROUP_ITSELF
};
/*
Descriptor of log record type
Note: Don't reorder because of constructs later...
......@@ -338,7 +343,7 @@ typedef struct st_log_record_type_descriptor
/* the rest is for maria_read_log & Recovery */
/** @brief for debug error messages or "maria_read_log" command-line tool */
const char *name;
my_bool record_ends_group;
enum enum_record_in_group record_in_group;
/* a function to execute when we see the record during the REDO phase */
int (*record_execute_in_redo_phase)(const TRANSLOG_HEADER_BUFFER *);
/* a function to execute when we see the record during the UNDO phase */
......
......@@ -22,4 +22,8 @@
/* This is the interface of this module. */
/* Performs recovery of the engine at start */
int recovery();
C_MODE_START
int maria_recover();
int maria_apply_log(LSN lsn, my_bool applyn, FILE *trace_file);
C_MODE_END
......@@ -62,8 +62,8 @@ int maria_rename(const char *old_name, const char *new_name)
this is important; make sure transactionality has been re-enabled.
*/
DBUG_ASSERT(share->now_transactional == share->base.born_transactional);
sync_dir= (share->now_transactional && !share->temporary) ?
MY_SYNC_DIR : 0;
sync_dir= (share->now_transactional && !share->temporary &&
!maria_in_recovery) ? MY_SYNC_DIR : 0;
if (sync_dir)
{
uchar log_data[2 + 2];
......
......@@ -47,7 +47,7 @@ static void copy_key(struct st_maria_info *info,uint inx,
static int verbose=0,testflag=0,
first_key=0,async_io=0,pagecacheing=0,write_cacheing=0,locking=0,
rec_pointer_size=0,pack_fields=1,silent=0,
opt_quick_mode=0, transactional= 0;
opt_quick_mode=0, transactional= 0, skip_update= 0;
static int pack_seg=HA_SPACE_PACK,pack_type=HA_PACK_KEY,remove_count=-1;
static int create_flag= 0, srand_arg= 0;
static ulong pagecache_size=IO_SIZE*16;
......@@ -84,7 +84,24 @@ int main(int argc, char *argv[])
if (! async_io)
my_disable_async_io=1;
maria_init();
maria_data_root= ".";
/* Maria requires that we always have a page cache */
if (maria_init() ||
(init_pagecache(maria_pagecache, pagecache_size, 0, 0,
maria_block_size) == 0) ||
ma_control_file_create_or_open(TRUE) ||
(init_pagecache(maria_log_pagecache,
TRANSLOG_PAGECACHE_SIZE, 0, 0,
TRANSLOG_PAGE_SIZE) == 0) ||
translog_init(maria_data_root, TRANSLOG_FILE_SIZE,
0, 0, maria_log_pagecache,
TRANSLOG_DEFAULT_FLAGS) ||
(transactional && trnman_init()))
{
fprintf(stderr, "Error in initialization");
exit(1);
}
reclength=STANDARD_LENGTH+60+(use_blob ? 8 : 0);
blob_pos=STANDARD_LENGTH+60;
keyinfo[0].seg= &glob_keyseg[0][0];
......@@ -220,22 +237,6 @@ int main(int argc, char *argv[])
goto err;
if (!silent)
printf("- Writing key:s\n");
maria_data_root= ".";
/* Maria requires that we always have a page cache */
if ((init_pagecache(maria_pagecache, pagecache_size, 0, 0,
maria_block_size) == 0) ||
ma_control_file_create_or_open(TRUE) ||
(init_pagecache(maria_log_pagecache,
TRANSLOG_PAGECACHE_SIZE, 0, 0,
TRANSLOG_PAGE_SIZE) == 0) ||
translog_init(maria_data_root, TRANSLOG_FILE_SIZE,
0, 0, maria_log_pagecache,
TRANSLOG_DEFAULT_FLAGS))
{
fprintf(stderr, "Error in initialization");
exit(1);
}
if (locking)
maria_lock_database(file,F_WRLCK);
if (write_cacheing)
......@@ -246,6 +247,14 @@ int main(int argc, char *argv[])
for (i=0 ; i < recant ; i++)
{
ulong blob_length;
#if 0
/*
Starting from i==72, there was a difference between runtime and
log-applying. This is now fixed, by not using non_header_data_len in
log-applying.
*/
if (i == 72) goto end;
#endif
n1=rnd(1000); n2=rnd(100); n3=rnd(5000);
sprintf(record,"%6d:%4d:%8d:Pos: %4d ",n1,n2,n3,write_count);
int4store(record+STANDARD_LENGTH-4,(long) i);
......@@ -260,7 +269,7 @@ int main(int argc, char *argv[])
printf("Error: %d in write at record: %d\n",my_errno,i);
goto err;
}
if (verbose) printf(" Double key: %d\n",n3);
if (verbose) printf(" Double key: %d at record# %d\n", n3, i);
}
else
{
......@@ -294,7 +303,7 @@ int main(int argc, char *argv[])
if (maria_extra(file,HA_EXTRA_NO_CACHE,0))
{
puts("got error from maria_extra(HA_EXTRA_NO_CACHE)");
goto end;
goto err;
}
}
#ifdef REMOVE_WHEN_WE_HAVE_RESIZE
......@@ -376,6 +385,8 @@ int main(int argc, char *argv[])
else
bmove(record+blob_pos,read_record+blob_pos,8);
}
if (skip_update)
continue;
if (maria_update(file,read_record,record2))
{
if (my_errno != HA_ERR_FOUND_DUPP_KEY || key3[n3] == 0)
......@@ -423,7 +434,7 @@ int main(int argc, char *argv[])
if (memcmp(read_record,read_record2,reclength) != 0)
{
printf("maria_rsame didn't find same record\n");
goto end;
goto err;
}
info.recpos=maria_position(file);
if (maria_rfirst(file,read_record2,0) ||
......@@ -431,7 +442,7 @@ int main(int argc, char *argv[])
memcmp(read_record,read_record2,reclength) != 0)
{
printf("maria_rsame_with_pos didn't find same record\n");
goto end;
goto err;
}
{
info.recpos= maria_position(file);
......@@ -442,7 +453,7 @@ int main(int argc, char *argv[])
info.recpos != maria_position(file))
{
printf("maria_rsame_with_pos lost position\n");
goto end;
goto err;
}
}
ant=1;
......@@ -451,7 +462,7 @@ int main(int argc, char *argv[])
if (ant != dupp_keys)
{
printf("next: Found: %d keys of %d\n",ant,dupp_keys);
goto end;
goto err;
}
ant=0;
while (maria_rprev(file,read_record3,0) == 0 &&
......@@ -459,7 +470,7 @@ int main(int argc, char *argv[])
if (ant != dupp_keys)
{
printf("prev: Found: %d records of %d\n",ant,dupp_keys);
goto end;
goto err;
}
/* Check of maria_rnext_same */
......@@ -471,7 +482,7 @@ int main(int argc, char *argv[])
if (ant != dupp_keys || my_errno != HA_ERR_END_OF_FILE)
{
printf("maria_rnext_same: Found: %d records of %d\n",ant,dupp_keys);
goto end;
goto err;
}
}
......@@ -482,7 +493,7 @@ int main(int argc, char *argv[])
if (maria_rfirst(file,read_record,0))
{
printf("Can't find first record\n");
goto end;
goto err;
}
while ((error=maria_rnext(file,read_record3,0)) == 0 && ant < write_count+10)
ant++;
......@@ -490,7 +501,7 @@ int main(int argc, char *argv[])
{
printf("next: I found: %d records of %d (error: %d)\n",
ant, write_count - opt_delete, error);
goto end;
goto err;
}
if (maria_rlast(file,read_record2,0) ||
bcmp(read_record2,read_record3,reclength))
......@@ -498,7 +509,7 @@ int main(int argc, char *argv[])
printf("Can't find last record\n");
DBUG_DUMP("record2",(byte*) read_record2,reclength);
DBUG_DUMP("record3",(byte*) read_record3,reclength);
goto end;
goto err;
}
ant=1;
while (maria_rprev(file,read_record3,0) == 0 && ant < write_count+10)
......@@ -506,12 +517,12 @@ int main(int argc, char *argv[])
if (ant != write_count - opt_delete)
{
printf("prev: I found: %d records of %d\n",ant,write_count);
goto end;
goto err;
}
if (bcmp(read_record,read_record3,reclength))
{
printf("Can't find first record\n");
goto end;
goto err;
}
if (!silent)
......@@ -552,7 +563,7 @@ int main(int argc, char *argv[])
if (bcmp(read_record+start,key,(uint) i))
{
puts("Didn't find right record");
goto end;
goto err;
}
}
if (dupp_keys > 2)
......@@ -570,7 +581,7 @@ int main(int argc, char *argv[])
if (ant != dupp_keys-1)
{
printf("next: I can only find: %d keys of %d\n",ant,dupp_keys-1);
goto end;
goto err;
}
}
if (dupp_keys>4)
......@@ -588,7 +599,7 @@ int main(int argc, char *argv[])
if (ant != dupp_keys-2)
{
printf("next: I can only find: %d keys of %d\n",ant,dupp_keys-2);
goto end;
goto err;
}
}
if (dupp_keys > 6)
......@@ -607,7 +618,7 @@ int main(int argc, char *argv[])
if (ant != dupp_keys-3)
{
printf("next: I can only find: %d keys of %d\n",ant,dupp_keys-3);
goto end;
goto err;
}
if (!silent)
......@@ -622,7 +633,7 @@ int main(int argc, char *argv[])
if (ant != dupp_keys-4)
{
printf("next: I can only find: %d keys of %d\n",ant,dupp_keys-4);
goto end;
goto err;
}
}
......@@ -655,7 +666,7 @@ int main(int argc, char *argv[])
if (bcmp(read_record,read_record2,reclength) != 0)
{
printf("maria_rsame didn't find same record\n");
goto end;
goto err;
}
}
if (!silent)
......@@ -682,7 +693,7 @@ int main(int argc, char *argv[])
{
printf("maria_records_range returned %ld; Should be about %ld\n",
(long) range_records,(long) info.records);
goto end;
goto err;
}
if (verbose)
{
......@@ -719,7 +730,7 @@ int main(int argc, char *argv[])
{
printf("maria_records_range for key: %d returned %lu; Should be about %lu\n",
i, (ulong) range_records, (ulong) records);
goto end;
goto err;
}
if (verbose && records)
{
......@@ -740,6 +751,7 @@ int main(int argc, char *argv[])
puts("Wrong info from maria_info");
printf("Got: records: %lu delete: %lu i_keys: %d\n",
(ulong) info.records, (ulong) info.deleted, info.keys);
goto err;
}
if (verbose)
{
......@@ -764,7 +776,7 @@ int main(int argc, char *argv[])
if (locking || (!use_blob && !pack_fields))
{
puts("got error from maria_extra(HA_EXTRA_CACHE)");
goto end;
goto err;
}
}
ant=0;
......@@ -777,12 +789,12 @@ int main(int argc, char *argv[])
{
printf("scan with cache: I can only find: %d records of %d\n",
ant,write_count-opt_delete);
goto end;
goto err;
}
if (maria_extra(file,HA_EXTRA_NO_CACHE,0))
{
puts("got error from maria_extra(HA_EXTRA_NO_CACHE)");
goto end;
goto err;
}
ant=0;
......@@ -794,7 +806,7 @@ int main(int argc, char *argv[])
{
printf("scan with cache: I can only find: %d records of %d\n",
ant,write_count-opt_delete);
goto end;
goto err;
}
if (testflag == 4)
......@@ -852,6 +864,15 @@ int main(int argc, char *argv[])
goto err;
}
opt_delete++;
#if 0
/
/*
179 is ok, 180 causes a difference between runtime and log-applying.
This is now fixed (we zero the last directory entry during
log-applying, just to eliminate this irrelevant difference).
*/
if (opt_delete==180) goto end;
#endif
}
else
found_parts++;
......@@ -1021,6 +1042,9 @@ static void get_options(int argc, char **argv)
case 'D':
create_flag|=HA_CREATE_DELAY_KEY_WRITE;
break;
case 'g':
skip_update= TRUE;
break;
case '?':
case 'I':
case 'V':
......
......@@ -6,6 +6,9 @@
# If you want to run this in Valgrind, you should use --trace-children=yes,
# so that it detects problems in ma_test* and not in the shell script
# Running in a "shared memory" disk is 10 times faster; you can do
# mkdir /dev/shm/test; cd /dev/shm/test; maria_path=<path_to_maria_binaries>
# Remove # from following line if you need some more information
#set -x -v -e
......@@ -21,6 +24,7 @@ fi
# Delete temporary files
rm -f *.TMD
rm -f maria_log*
run_tests()
{
......@@ -211,8 +215,14 @@ echo "$maria_path/maria_chk$suffix -sm test2 will warn that 'Datafile is almost
$maria_path/maria_chk$suffix -sm test2 >ma_test2_message.txt 2>&1
cat ma_test2_message.txt
grep "warning: Datafile is almost full" ma_test2_message.txt >/dev/null
rm -f ma_test2_message.txt
$maria_path/maria_chk$suffix -ssm test2
#
# Test that removing tables and applying the log leads to identical tables
#
/bin/sh $maria_path/ma_test_recovery
#
# Some timing tests
#
......
set -e
if [ -z "$maria_path" ]
then
maria_path="."
fi
echo "MARIA RECOVERY TESTS - success is if exit code is 0"
# runs a program inserting/deleting rows, then moves the resulting table
# elsewhere; applies the log and checks that the data file is
# identical to the saved original.
# Does not test the index file as we don't have logging for it yet.
rm -f maria_log*
prog="$maria_path/ma_test1 -M -T --skip-update"
echo "TEST WITH $prog"
$prog
mv -f test1.MAD test1.MAD.good
rm test1.MAI
echo "applying log"
$maria_path/maria_read_log -a > /dev/null
cmp test1.MAD test1.MAD.good
rm -f test1.*
rm -f maria_log*
prog="$maria_path/ma_test2 -s -L -K -W -P -M -T -g"
echo "TEST WITH $prog"
$prog
mv -f test2.MAD test2.MAD.good
rm test2.MAI
echo "applying log"
$maria_path/maria_read_log -a > /dev/null
cmp test2.MAD test2.MAD.good
rm -f test2.*
echo "ALL RECOVERY TESTS OK"