unknown authored
Minimize writes to transactional Maria tables: don't write data pages, state, and open_count at the end of each statement.

Data pages will be written periodically by a background thread. State will be written periodically by Checkpoint. open_count serves to detect when a table is potentially damaged due to an unclean mysqld stop, but recovery corrects the effects of an unclean stop, so open_count becomes useless. As state is written less often, it is often obsolete on disk, so we should avoid reading it from disk.

- With the data page writes above removed, they must be put back at the start of some statements like check, repair and delete_all. This was in fact already necessary (see ma_delete_all.c).
- Disabling CACHE INDEX on Maria tables for now (fixes crash of test 'key_cache' when run with --default-storage-engine=maria).
- Correcting some fishy code in ma_extra.c (in theory, we could lose index pages when doing a DROP TABLE under Windows).

storage/maria/ha_maria.cc:
  Disable CACHE INDEX in Maria for now (there is a single cache for now); it crashes and it's not a priority.

storage/maria/ma_bitmap.c:
  Debug message.

storage/maria/ma_check.c:
  The statement before maria_repair() may not flush state, so maria_repair() must do it (this function uses maria_open(HA_OPEN_COPY), which reads state from disk, so it needs to find the state up to date on disk). For safety (normally this is not needed) we remove index blocks from the cache before repairing.
  _ma_flush_blocks() becomes _ma_flush_table_files_after_repair(): it now additionally flushes the data file and state and syncs files. As a side effect, the assertion "no WRITE_CACHE_USED" from _ma_flush_table_files() fired, so we move all end_io_cache() calls done at the end of repair to before the calls to _ma_flush_table_files_after_repair().

storage/maria/ma_close.c:
  When closing a transactional table, we fsync it, but we need to do this only after writing its state. We need to write the state at close time only for transactional tables (the other tables do that at last unlock). Putting back the O_RDONLY||crashed condition which I had removed earlier. Unmap the file before syncing it (does not matter now as Maria does not use mmap).

storage/maria/ma_delete_all.c:
  Need to flush data pages before chsize-ing the data file. This was needed even when we flushed data pages at the end of each statement, because we didn't do it under LOCK TABLES anyway. The change here thus fixes this bug:
    create table t(a int) engine=maria;
    lock tables t write;
    insert into t values(1);
    delete from t;
    unlock tables;
    check table t;
    "Size of datafile is: 16384 Should be: 8192"
  (an obsolete page went to disk after the chsize(), at unlock time).

storage/maria/ma_extra.c:
  When doing share->last_version=0, we make the in-memory MARIA_SHARE invisible to future openers, so they need to find an up-to-date state on disk. In the same way, future openers will reopen the data and index files, so they will not find our cached blocks, so we need to flush them to disk.
  In HA_EXTRA_FORCE_REOPEN, this probably happens naturally as all tables normally get closed; we add a safety flush anyway.
  In HA_EXTRA_PREPARE_FOR_RENAME, we need to do the flushing. On Windows we additionally need to close files.
  In HA_EXTRA_PREPARE_FOR_DROP, we don't need to flush anything, only remove dirty cached blocks from memory. On Windows we need to close files.
  Closing files forces us to sync them first (requirement for transactional tables).
  For mutex reasons (don't lock intern_lock twice), we move maria_lock_database() and _ma_decrement_open_count() first in the list of operations.
  Also flush the data file in HA_EXTRA_FLUSH.

storage/maria/ma_locking.c:
  For transactional tables:
  - don't write data pages / state at unlock time; as a consequence, "share->changed=0" cannot be done.
  - don't write state in _ma_writeinfo().
  - don't maintain open_count on disk (Recovery corrects the table in case of crash anyway, and we gain speed by not writing open_count to disk).
  For non-transactional tables, flush the state at unlock only if the table was changed (optimization).
  Code which read the state from disk is relevant only with external locking, so we disable it. If we want to re-enable it, it should not apply to transactional tables, as their state on disk may be obsolete (such tables do not flush state at unlock anymore).
  The comment "We have to flush the write cache" is now wrong because maria_lock_database(F_UNLCK) now happens before thr_unlock(), and we are not using external locking.

storage/maria/ma_open.c:
  _ma_state_info_read() is only used in ma_open.c, so make it static.

storage/maria/ma_recovery.c:
  Set MARIA_SHARE::changed to TRUE when we are going to apply a REDO/UNDO, so that the state gets flushed at close.

storage/maria/ma_test_recovery.expected:
  Changes introduced by this patch:
  - good: the "open" (table open, not properly closed) is gone; it was pointless for a recovered table.
  - bad: probably stemming from different moments of writing the index's state (_ma_writeinfo() used to write the state after every row write in ma_test* programs; it doesn't anymore as the table is transactional): some differences in indexes (not relevant as we don't yet have recovery for them); some differences in count of records (changed from one wrong value to another wrong value; not relevant as we don't recover this count correctly yet anyway, though a patch will be pushed soon).

storage/maria/ma_test_recovery:
  For repeatable output, no names of varying directories.

storage/maria/maria_chk.c:
  Function renamed.

storage/maria/maria_def.h:
  Function became local to ma_open.c. Function renamed.
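The ordering problem described for ma_delete_all.c (a stale cached page reaching disk after the chsize()) can be reproduced in miniature. The following is only a sketch under stated assumptions, not Maria code: simulate_delete_all(), the one-page "cache" flag, the 8192-byte page size and the temporary file path are all hypothetical, and plain POSIX pwrite()/ftruncate() stand in for the page cache flush and chsize().

```c
/* Sketch only: simulate_delete_all() and its one-page "cache" are
   hypothetical stand-ins, not part of Maria. */
#define _XOPEN_SOURCE 700
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>

#define PAGE_SIZE 8192

/* Returns the final data file size in bytes, or -1 on error.
   flush_before_truncate != 0 models the fixed ordering. */
long simulate_delete_all(int flush_before_truncate)
{
  char path[] = "/tmp/ma_sketch_XXXXXX";
  char page[PAGE_SIZE];
  struct stat st;
  long result = -1;
  int dirty = 1;             /* INSERT left one dirty page in the cache */
  int fd = mkstemp(path);

  if (fd < 0)
    return -1;
  memset(page, 0, sizeof(page));

  /* First page is already on disk: the size of an empty table here. */
  if (pwrite(fd, page, PAGE_SIZE, 0) != PAGE_SIZE)
    goto end;

  if (flush_before_truncate)
  {
    /* Fixed ordering: write the cached page out before shrinking. */
    if (pwrite(fd, page, PAGE_SIZE, PAGE_SIZE) != PAGE_SIZE)
      goto end;
    dirty = 0;
  }

  /* DELETE of all rows: shrink the file back to one page. */
  if (ftruncate(fd, PAGE_SIZE) != 0)
    goto end;

  /* UNLOCK TABLES: anything still dirty is written now, too late. */
  if (dirty && pwrite(fd, page, PAGE_SIZE, PAGE_SIZE) != PAGE_SIZE)
    goto end;

  if (fstat(fd, &st) == 0)
    result = (long) st.st_size;

end:
  close(fd);
  unlink(path);
  return result;
}
```

In this sketch, calling simulate_delete_all(0) leaves a two-page file because the late write extends it again after the truncate, while simulate_delete_all(1) leaves the one-page file that a check of an empty table would expect.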
f456b30c