Commits · 96d7294586dbd80aa48902138e3e42aec8977d90 · nexedi / MariaDB

14 Jun, 2020 7 commits

Fixed access of undefined memory for compressed MyISAM and Aria tables · 96d72945

Monty authored Jun 03, 2020

MDEV-22689 MSAN use-of-uninitialized-value in decode_bytes()

This was not a user-visible issue, as the Huffman code lookup tables would
automatically ignore any of the uninitialized bits.

Fixed by adding an end-zero byte to the bit-stream buffer.

Other things:
- Fixed a (for this case) wrong assert in strmov() for myisamchk
  and aria_chk by removing the strmov()
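
The effect of the end-zero byte can be illustrated with a minimal sketch. This is not the MariaDB decoder: `read_bits` and `LOOKUP_BITS` are hypothetical names, and the sketch assumes a decoder that peeks a fixed-width window of bits even when fewer meaningful bits remain. In C the over-read touches uninitialized memory; in Python the closest analogue is an `IndexError`.

```python
def read_bits(buf: bytes, bit_pos: int, nbits: int) -> int:
    """Return nbits starting at bit_pos, most significant bit first."""
    value = 0
    for i in range(bit_pos, bit_pos + nbits):
        byte = buf[i // 8]  # raises IndexError when the read leaves the buffer
        value = (value << 1) | ((byte >> (7 - i % 8)) & 1)
    return value


LOOKUP_BITS = 9                  # hypothetical fixed lookup-window width
data = bytes([0b10110000])       # one byte of real bit-stream data

# With a trailing zero byte the window read is well defined; the extra
# zero bit is simply ignored by a table lookup keyed on the leading bits.
padded = data + b"\x00"
window = read_bits(padded, 0, LOOKUP_BITS)
```

Without the padding byte, `read_bits(data, 0, LOOKUP_BITS)` would index past the end of the buffer.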


Make error messages from DROP TABLE and DROP TABLE IF EXISTS consistent · dfb41fdd

Monty authored Jun 05, 2020

- IF EXISTS ends with a list of all non-existing objects, instead of a
  separate note for every non-existing object
- Produce a "Note" for all wrongly dropped objects
  (like trying to do DROP SEQUENCE for a normal table)
- Do not write existing tables that could not be dropped to binlog

Other things:
MDEV-22820 Bogus "Unknown table" warnings produced upon attempt to drop
           parent table referenced by FK
This was caused by an older version of this patch and was fixed later.


Fixed error messages from DROP VIEW to align with DROP TABLE · 346d10a9

Monty authored Jun 05, 2020

- Produce a "Note" for all wrongly dropped objects
  (Like doing DROP VIEW on a table).
- IF EXISTS ends with a list of all non-existing objects, instead of a
  separate note for every non-existing object.

Other things:
 - Fixed bug where one could do CREATE TEMPORARY SEQUENCE multiple times
   and create multiple temporary sequences with the same name.


MDEV-11412 Ensure that table is truly dropped when using DROP TABLE · 5bcb1d65

Monty authored Jun 01, 2020

The code is largely based on code from Tencent.

The problem is that in some rare cases there may be a conflict between .frm
files and the files in the storage engine. In this case the DROP TABLE
was not able to properly drop the table.

Some MariaDB/MySQL forks have solved this by adding a FORCE option to
DROP TABLE. After some discussion among MariaDB developers, we concluded
that users expect that DROP TABLE should always work, even if the
table is not consistent. There should not be a need to use a
separate keyword to ensure that the table is really deleted.

The solution used is:
- If a .frm file doesn't exist, try dropping the table from all storage
  engines.
- If the .frm file exists but the table does not exist in the engine,
  try dropping the table from all storage engines.
- Update storage engines using many table files (CSV, MyISAM, Aria) to
  succeed with the drop even if some of the files are missing.
- Add HTON_AUTOMATIC_DELETE_TABLE to handlertons where delete_table()
  is not needed and always succeeds. This is used by ha_delete_table_force()
  to know which handlers to ignore when trying to drop a table without
  a .frm file.

The disadvantage of this solution is that a DROP TABLE on a non-existing
table will be a bit slower, as we have to ask all active storage engines
if they know anything about the table.
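
The drop logic described above can be sketched roughly as follows. This is an illustrative model, not the server code: the `Engine` class, the `frm` dictionary (table name to owning engine), and `drop_table` are hypothetical, and `automatic_delete` merely stands in for HTON_AUTOMATIC_DELETE_TABLE.

```python
from dataclasses import dataclass, field


@dataclass
class Engine:
    name: str
    tables: set = field(default_factory=set)
    automatic_delete: bool = False  # stands in for HTON_AUTOMATIC_DELETE_TABLE

    def delete_table(self, table: str) -> bool:
        """Drop the table if this engine knows it; report whether it did."""
        if table in self.tables:
            self.tables.remove(table)
            return True
        return False


def drop_table(table, frm, engines):
    """Return True if a .frm entry or any engine-side table was removed."""
    had_frm = table in frm
    owner = frm.pop(table, None)
    if owner is not None and owner.delete_table(table):
        return True
    # No .frm, or the owning engine does not know the table:
    # ask every engine that needs an explicit delete.
    found = False
    for engine in engines:
        if engine.automatic_delete:
            continue
        if engine.delete_table(table):
            found = True
    return found or had_frm


innodb = Engine("InnoDB", {"t1"})
aria = Engine("Aria")
frm = {}  # t1 exists in InnoDB, but its .frm file is missing
dropped = drop_table("t1", frm, [innodb, aria])
missing = drop_table("no_such_table", frm, [innodb, aria])
```

The second call shows the cost mentioned above: a drop of a non-existing table still visits every engine before it can report that nothing was found.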

Other things:
- Added a new flag MY_IGNORE_ENOENT to my_delete() to not give an error
  if the file doesn't exist. This simplifies some of the code.
- Don't clear thd->error in ha_delete_table() if there was an active
  error. This is a bug fix.
- handler::delete_table() will not abort if the first file doesn't exist.
  This is a bug fix to handle the case when a drop table was aborted in
  the middle.
- Cleaned up mysql_rm_table_no_locks() to ensure that if_exists uses
  the same code path as when it's not used.
- Use non_existing_Table_error() to detect if the table didn't exist.
  Old code used different error tests in different places.
- Table_triggers_list::drop_all_triggers() now drops the trigger file if
  it can't be parsed instead of leaving it hanging around (bug fix).
- InnoDB no longer prints an error about the .frm file being out of sync
  with the InnoDB directory if the .frm file does not exist. This change
  was required to be able to try to drop an InnoDB file when the .frm
  file doesn't exist.
- Fixed bug in mi_delete_table() where the .MYD file would not be dropped
  if the .MYI file didn't exist.
- Fixed memory leak in Mroonga when deleting non existing table
- Fixed memory leak in Connect when deleting non existing table
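
The idea behind MY_IGNORE_ENOENT and the mi_delete_table() fix (a missing file must not stop the remaining files from being deleted) can be sketched as below. The helper `delete_table_files` is hypothetical and uses Python's `os.remove`, not my_delete().

```python
import os
import tempfile


def delete_table_files(base, extensions):
    """Delete base+ext for every extension, skipping files that are missing.

    Returns the number of files actually removed.
    """
    removed = 0
    for ext in extensions:
        try:
            os.remove(base + ext)
            removed += 1
        except FileNotFoundError:
            continue  # same spirit as ignoring ENOENT: keep going
    return removed


with tempfile.TemporaryDirectory() as directory:
    base = os.path.join(directory, "t1")
    open(base + ".MYD", "w").close()  # the .MYI file is deliberately missing
    removed = delete_table_files(base, [".MYI", ".MYD"])
    leftover = os.listdir(directory)
```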

Fixed bugs that were introduced by the original version of this commit:
MDEV-22826 Presence of Spider prevents tables from being force-deleted from
           other engines


MDEV-22884: Adjust the test for PLUGIN_PERFSCHEMA=NO · 5579c389
Marko Mäkelä authored Jun 14, 2020


MDEV-22889: Disable innodb.innodb_force_recovery_rollback · ad5edf3c

Marko Mäkelä authored Jun 14, 2020

The test case that was added for MDEV-21217
(commit b68f1d84)
should have only two possible outcomes for the locking SELECT statement:

(1) The statement is blocked, and the test will eventually fail
with a lock wait timeout. This is what I observed when the
code fix for MDEV-21217 was missing.

(2) The lock conflict will ensure that the statement will execute
after the rollback has completed, and an empty table will be observed.
This is the expected outcome with the recovery fix.

What occasionally happens (in some of our CI environments only, so far)
is that the locking SELECT will return all 1,000 rows of the table that
had been inserted by the transaction that was never supposed to be
committed. One possibility is that the transaction was unexpectedly
committed when the server was killed.

Let us disable the test until the reason for the failure has been
determined and addressed.


Merge 10.4 into 10.5 · 3dbc49f0
Marko Mäkelä authored Jun 14, 2020


13 Jun, 2020 7 commits

MDEV-22884 Assertion `grant_table || grant_table_role' failed on perfschema · 9ed08f35
Sergei Golubchik authored Jun 13, 2020
```
when allowing access via perfschema callbacks, update
the cached GRANT_INFO to match
```

MDEV-21560 Assertion `grant_table || grant_table_role' failed in check_grant_all_columns · b58586aa

Sergei Golubchik authored Jun 13, 2020

With RETURNING it can happen that the user has some privileges on
the table (namely, DELETE), but later needs different privileges
on individual columns (namely, SELECT).

Do the same as in check_grant_column() - ER_COLUMNACCESS_DENIED_ERROR,
not an assert.


Merge 10.3 into 10.4 · 80534093
Marko Mäkelä authored Jun 13, 2020

Merge 10.2 into 10.3 · d83a4432
Marko Mäkelä authored Jun 13, 2020

MDEV-21217 innodb_force_recovery=2 may wrongly abort rollback · b68f1d84
Marko Mäkelä authored Jun 13, 2020
```
trx_roll_must_shutdown(): Correct the condition that detects
the start of shutdown.
```

MDEV-22190 InnoDB: Apparent corruption of an index page ... to be written · 574ef380

Marko Mäkelä authored Jun 13, 2020

An InnoDB check for the validity of index pages would occasionally fail
in the test encryption.innodb_encryption_discard_import.

An analysis of a "rr replay" failure trace revealed that the problem
basically is a combination of two old anomalies, and a recently
implemented optimization in MariaDB 10.5.

MDEV-15528 allows InnoDB to discard buffer pool pages that were freed.

PageBulk::init() will disable the InnoDB validity check, because
during native ALTER TABLE (rebuilding tables or creating indexes)
we could write inconsistent index pages to data files.

In the occasional test failure, page 8:6 would have been written
from the buffer pool to the data file and subsequently freed.

However, fil_crypt_thread may perform dummy writes to pages that
have been freed. In case we are causing an inconsistent page to
be re-encrypted on page flush, we should disable the check.

In the analyzed "rr replay" trace, a fil_crypt_thread attempted
to access page 8:6 twice after it had been freed.
On the first call, buf_page_get_gen(..., BUF_PEEK_IF_IN_POOL, ...)
returned NULL. The second call succeeded, and shortly thereafter,
the server intentionally crashed due to writing the corrupted page.


MDEV-22268 virtual longlong Item_func_div::int_op(): Assertion `0' failed in Item_func_div::int_op · 6c30bc21

Alexander Barkov authored Jun 13, 2020

Item_func_div::fix_length_and_dec_temporal() set the return data type to
integer in case of @div_precision_increment==0 for temporal input with FSP=0.
This caused Item_func_div to call int_op(), which is not implemented,
so a crash on DBUG_ASSERT(0) happened.

Fixing fix_length_and_dec_temporal() to set the result type to DECIMAL.


12 Jun, 2020 20 commits

when printing Item_in_optimizer, use precedence of wrapped Item · 114a8436

Sidney Cammeresi authored Jun 12, 2020

When Item::print() is called with the QT_PARSABLE flag, WHERE i NOT IN
(SELECT ...) gets printed as WHERE !i IN (SELECT ...) instead of WHERE
!(i IN (SELECT ...)), because Item_in_optimizer returns DEFAULT_PRECEDENCE.
It should return the precedence of the inner operation.
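
A small expression printer shows why the wrapper's precedence matters. All class names and precedence constants below are hypothetical stand-ins (the `Wrapper` class plays the role of Item_in_optimizer); the sketch only assumes that a printer parenthesizes an operand whose precedence is lower than the operator's.

```python
CMP_PRECEDENCE = 1      # comparison-like operators such as IN
NEG_PRECEDENCE = 2      # unary !
DEFAULT_PRECEDENCE = 3  # atoms that never need parentheses


class Column:
    precedence = DEFAULT_PRECEDENCE

    def __init__(self, name):
        self.name = name

    def sql(self):
        return self.name


class InSubquery:
    precedence = CMP_PRECEDENCE

    def __init__(self, left, subquery):
        self.left = left
        self.subquery = subquery

    def sql(self):
        return f"{self.left.sql()} IN ({self.subquery})"


class Wrapper:
    """Transparent wrapper around another expression."""

    def __init__(self, inner, forward_precedence):
        self.inner = inner
        self.forward_precedence = forward_precedence

    @property
    def precedence(self):
        if self.forward_precedence:
            return self.inner.precedence  # behaviour after the fix
        return DEFAULT_PRECEDENCE         # behaviour before the fix

    def sql(self):
        return self.inner.sql()


class Not:
    precedence = NEG_PRECEDENCE

    def __init__(self, arg):
        self.arg = arg

    def sql(self):
        text = self.arg.sql()
        if self.arg.precedence < self.precedence:
            text = f"({text})"
        return "!" + text


in_expr = InSubquery(Column("i"), "SELECT a FROM t")
before = Not(Wrapper(in_expr, forward_precedence=False)).sql()
after = Not(Wrapper(in_expr, forward_precedence=True)).sql()
```

Here `before` loses the parentheses and `after` keeps them, because only the forwarding wrapper lets the negation see that its operand binds more loosely.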


MDEV-22840: JSON_ARRAYAGG gives wrong results with NULL values and ORDER by clause · ab9bd628

Varun Gupta authored Jun 09, 2020

The problem here is similar to the case with DISTINCT: the tree used for ORDER BY
needs to also hold the null bytes of the record. This was not done for GROUP_CONCAT
as NULLs are rejected by GROUP_CONCAT.

Also introduced a comparator function for the order by tree to handle null
values with JSON_ARRAYAGG.


MDEV-22011: DISTINCT with JSON_ARRAYAGG gives wrong results · 0f6f0daa

Varun Gupta authored Jun 09, 2020

For DISTINCT to be handled with JSON_ARRAYAGG, we need to make sure
that the Unique tree also holds the NULL bytes of a table record
inside the node of the tree. This behaviour for JSON_ARRAYAGG is
different from GROUP_CONCAT because in GROUP_CONCAT we just reject
NULL values for columns.

Also introduced a comparator function for the unique tree to handle null
values for distinct inside JSON_ARRAYAGG.
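
A NULL-aware comparator of the kind described can be sketched as follows. The `(is_null, value)` pair is a hypothetical stand-in for a record image that carries its null bytes; the real tree stores raw record bytes.

```python
from functools import cmp_to_key


def compare_with_nulls(a, b):
    """Compare (is_null, value) pairs: NULL sorts first, two NULLs are equal."""
    a_null, a_val = a
    b_null, b_val = b
    if a_null or b_null:
        return int(b_null) - int(a_null)
    return (a_val > b_val) - (a_val < b_val)


def distinct(rows):
    """Keep one representative of each group of equal rows."""
    result = []
    for row in sorted(rows, key=cmp_to_key(compare_with_nulls)):
        if not result or compare_with_nulls(result[-1], row) != 0:
            result.append(row)
    return result


# A NULL column still has some value bytes (0 here), so comparing the
# value alone would wrongly merge NULL with a real 0.
rows = [(False, 0), (True, 0), (False, 1), (True, 0), (False, 0)]
with_null_flags = distinct(rows)
values_only = sorted({value for _, value in rows})
```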


MDEV-11563: GROUP_CONCAT(DISTINCT ...) may produce a non-distinct list · a006e88c

Varun Gupta authored Mar 21, 2020

Backported from MySQL
Bug #25331425: DISTINCT CLAUSE DOES NOT WORK IN GROUP_CONCAT
Issue:
------
The problem occurs when:
1) GROUP_CONCAT (DISTINCT ....) is used in the query.
2) Data size greater than value of system variable:
tmp_table_size.

The result would contain values that are non-unique.

Root cause:
-----------
An in-memory structure is used to filter out non-unique
values. When the data size exceeds tmp_table_size, the
overflow is written to disk as a separate file. The
expectation here is that when all such files are merged,
the full set of unique values can be obtained.

But the Item_func_group_concat::add function is in a bit of a
hurry. Even as it is adding values to the tree, it wants to
decide if a value is unique and write it to the result
buffer. This works fine if the configured maximum size is
greater than the size of the data. But since tmp_table_size
is set to a low value, the size of the tree is smaller and
hence requires the creation of multiple copies on disk.

Item_func_group_concat currently has no mechanism to merge
all the copies on disk and then generate the result. This
results in duplicate values.

Solution:
---------
In case of the DISTINCT clause, don't write to the result
buffer immediately. Do the merge and only then put the
unique values in the result buffer. This is done in
Item_func_group_concat::val_str.

Note regarding result file changes:
-----------------------------------
Earlier, when a unique value was seen in
Item_func_group_concat::add, it was dumped to the output,
so the result was in the order stored in the storage engine.
With this fix, we wait until all the data is read and the final
set of unique values is written to the output buffer, so the
data appears in sorted order.

This only fixes the cases when we have DISTINCT without ORDER BY clause
in GROUP_CONCAT.
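
The root cause and the fix can be modelled with a toy example. Both functions are hypothetical simplifications: a Python `set` with a fixed capacity stands in for the in-memory tree, and a list of sorted runs stands in for the files written to disk.

```python
import heapq


def concat_distinct_eager(values, tree_capacity):
    """Old behaviour: emit a value as soon as the current tree has not seen it."""
    result, tree = [], set()
    for value in values:
        if value in tree:
            continue
        if len(tree) == tree_capacity:
            tree = set()  # tree spilled to disk; its contents are forgotten
        tree.add(value)
        result.append(value)
    return result


def concat_distinct_merged(values, tree_capacity):
    """Fixed behaviour: merge all spilled runs first, then emit unique values."""
    runs, tree = [], set()
    for value in values:
        if value in tree:
            continue
        if len(tree) == tree_capacity:
            runs.append(sorted(tree))
            tree = set()
        tree.add(value)
    runs.append(sorted(tree))
    merged = []
    for value in heapq.merge(*runs):
        if not merged or merged[-1] != value:
            merged.append(value)
    return merged


values = [3, 1, 3, 2, 1, 3]
eager = concat_distinct_eager(values, tree_capacity=2)
merged = concat_distinct_merged(values, tree_capacity=2)
```

The merged variant also shows the result-order change noted above: the output comes back sorted rather than in input order.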


MDEV-15101: Stop ANALYZE TABLE from flushing table definition cache · fd1755e4
Sergei Petrunia authored Jun 12, 2020
```
Part#2: forgot to commit the adjustments for the testcases.
```

MDEV-22867 Assertion instant.n_core_fields == n_core_fields failed · 43120009

Marko Mäkelä authored Jun 12, 2020

This is a race condition where a table on which a 10.3-style
instant ADD COLUMN is emptied during the execution of
ALTER TABLE ... DROP COLUMN ..., DROP INDEX ..., ALGORITHM=NOCOPY.

In commit 2c4844c9 the
function instant_metadata_lock() would prevent this race condition.
But, it would also hold a page latch on the leftmost leaf page of
clustered index for the duration of a possible DROP INDEX operation.

The race could be fixed by restoring the function
instant_metadata_lock() that was removed in
commit ea37b144
but it would be more future-proof to prevent the
dict_index_t::clear_instant_add() call from being issued at all.

We may at some point support DROP COLUMN ..., ADD INDEX ..., ALGORITHM=NOCOPY,
and that would spend a non-trivial amount of
execution time in ha_innobase::inplace_alter(),
making a server hang possible. Currently this is not supported,
and our added test case will notice when the support is introduced.

dict_index_t::must_avoid_clear_instant_add(): Determine if
a call to clear_instant_add() must be avoided.

btr_discard_only_page_on_level(): Preserve the metadata record
if must_avoid_clear_instant_add() holds.

btr_cur_optimistic_delete_func(), btr_cur_pessimistic_delete():
Do not remove the metadata record even if the table becomes empty
but must_avoid_clear_instant_add() holds.

btr_pcur_store_position(): Relax a debug assertion.

This is joint work with Thirunarayanan Balathandayuthapani.


MDEV-15101: Stop ANALYZE TABLE from flushing table definition cache · d7d80689

Sergei Petrunia authored Jun 12, 2020

Apply this patch from Percona Server (amended for 10.5):

commit cd7201514fee78aaf7d3eb2b28d2573c76f53b84
Author: Laurynas Biveinis <laurynas.biveinis@gmail.com>
Date:   Tue Nov 14 06:34:19 2017 +0200

    Fix bug 1704195 / 87065 / TDB-83 (Stop ANALYZE TABLE from flushing table definition cache)

    Make ANALYZE TABLE stop flushing affected tables from the table
    definition cache, which has the effect of not blocking any subsequent
    new queries involving the table if there's a parallel long-running
    query:

    - new table flag HA_ONLINE_ANALYZE, return it for InnoDB and TokuDB
      tables;
    - in mysql_admin_table, if we are performing ANALYZE TABLE, and the
      table flag is set, do not remove the table from the table
      definition cache, do not invalidate query cache;
    - in partitioning handler, refresh the query optimizer statistics
      after ANALYZE if the underlying handler supports HA_ONLINE_ANALYZE;
    - new testcases main.percona_nonflushing_analyze_debug,
      parts.percona_nonflushing_abalyze_debug and a supporting debug sync
      point.

    For TokuDB, this change exposes bug TDB-83 (Index cardinality stats
    updated for handler::info(HA_STATUS_CONST), not often enough for
    tokudb_cardinality_scale_percent). TokuDB may return different
    rec_per_key values depending on dynamic variable
    tokudb_cardinality_scale_percent value. The server does not have a way
    of knowing that changing this variable invalidates the previous
    rec_per_key values in any opened table shares, and so does not call
    info(HA_STATUS_CONST) again. Fix by updating rec_per_key for both
    HA_STATUS_CONST and HA_STATUS_VARIABLE. This also forces a re-record
    of tokudb.bugs.db756_card_part_hash_1_pick, with the new output
    seeming to be more correct.


MDEV-8139: Fix the MSAN instrumentation · d34cc6b3
Thirunarayanan Balathandayuthapani authored Jun 12, 2020


MDEV-22877 Avoid unnecessary buf_pool.page_hash S-latch acquisition · d2c593c2

Marko Mäkelä authored Jun 12, 2020

MDEV-15053 did not remove all unnecessary buf_pool.page_hash S-latch
acquisition. There are code paths where we are holding buf_pool.mutex
(which will sufficiently protect buf_pool.page_hash against changes)
and unnecessarily acquire the latch. Many invocations of
buf_page_hash_get_locked() can be replaced with the much simpler
buf_pool.page_hash_get_low().

In the worst case the thread that is holding buf_pool.mutex will become
a victim of MDEV-22871, suffering from a spurious reader-reader conflict
with another thread that genuinely needs to acquire a buf_pool.page_hash
S-latch.

In many places, we were also evaluating page_id_t::fold() while holding
buf_pool.mutex. Low-level functions such as buf_pool.page_hash_get_low()
must get the page_id_t::fold() as a parameter.

buf_buddy_relocate(): Defer the hash_lock acquisition to the critical
section that starts by calling buf_page_t::can_relocate().


more mysql_create_view link/unlink woes · 0b5dc626
Sergei Golubchik authored Jun 12, 2020

MDEV-22878 galera.wsrep_strict_ddl hangs in 10.5 after merge · fb70eb77
Sergei Golubchik authored Jun 12, 2020
```
if mysql_create_view is aborted when `view` isn't unlinked,
it should not be linked back on cleanup
```
MDEV-21851: post-push to fix main.flush_read_lock. · efa67ee0
Andrei Elkin authored Jun 12, 2020

MDEV-16470: switch off user variables (and fixes of its support) · 82f3ceed
Oleksandr Byelkin authored Jun 11, 2020


MDEV-22834: Disks plugin - change datatype to bigint · 8ec21afc

Vicențiu Ciorbaru authored Jun 10, 2020

On large hard disks (> 2TB), the plugin won't function correctly, always
showing 2 TB of available space due to integer overflow. Upgrade table
fields to bigint to resolve this problem.
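
The arithmetic behind the 2 TB ceiling can be sketched under two loudly stated assumptions that the commit message does not confirm: that sizes are reported in 1 KiB units, and that an out-of-range value is clipped to the column's maximum. The `store` helper is hypothetical.

```python
INT32_MAX = 2**31 - 1   # largest value of a signed 32-bit column
INT64_MAX = 2**63 - 1   # largest value of a signed 64-bit (bigint) column


def store(value, column_max):
    """Assumed behaviour: an out-of-range value is clipped to the column maximum."""
    return min(value, column_max)


KIB_PER_TIB = 1024**3
disk_size_kib = 4 * KIB_PER_TIB          # a 4 TiB disk, expressed in KiB

as_int = store(disk_size_kib, INT32_MAX)
as_bigint = store(disk_size_kib, INT64_MAX)

shown_tib_int = as_int / KIB_PER_TIB     # just under 2 TiB
shown_tib_bigint = as_bigint / KIB_PER_TIB
```

Under these assumptions every disk larger than 2 TiB reports the same clipped value in a 32-bit column, while a bigint column has room to spare.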


MDEV-21851: Error in BINLOG_BASE64_EVENT is always error-logged as if it is done by Slave · e156a8da
Andrei Elkin authored Mar 01, 2020
```
The prefix of error log message out of a failed BINLOG applying
is corrected to be the sql command name.
```

MDEV-22602 Disable UPDATE CASCADE for SQL constraints · 762bf7a0

Aleksey Midenkov authored Jun 12, 2020

CHECK constraint is checked by check_expression() which walks its
items and gets into Item_field::check_vcol_func_processor() to check
for conformity with foreign key list.

WITHOUT OVERLAPS is checked for same conformity in
mysql_prepare_create_table().

Long uniques are already impossible with InnoDB foreign keys. See
ER_CANT_CREATE_TABLE in test case.

2 accompanying bugs fixed (test main.constraints failed):

1. check->name.str lived on SP execute mem_root while "check" obj
itself lives on SP main mem_root. On second SP execute check->name.str
had garbage data. Fixed by allocating from thd->stmt_arena->mem_root
which is SP main mem_root.

2. CHECK_CONSTRAINT_IF_NOT_EXISTS value was mixed with
VCOL_FIELD_REF. VCOL_FIELD_REF is assigned in check_expression() and
then detected as CHECK_CONSTRAINT_IF_NOT_EXISTS in
handle_if_exists_options().

Existing cases for MDEV-16932 in main.constraints cover both fixes.


Fix wrong merge of commit d218d1aa · 2fd2fd77
Vicențiu Ciorbaru authored Jun 12, 2020

MDEV-22119: main.innodb_ext_key fails sporadically · 02c255d1
Varun Gupta authored Jun 12, 2020
```
Made the test stable by adding more rows so the range scan is cheaper than table scan.
```

MDEV-22499 Assertion `(uint) (table_check_constraints -... · f9e53a65

Alexander Barkov authored Jun 12, 2020

MDEV-22499 Assertion `(uint) (table_check_constraints - share->check_constraints) == (uint) (share->table_check_constraints - share->field_check_constraints)' failed in TABLE_SHARE::init_from_binary_frm_image

The patch for MDEV-22111 fixed MDEV-22499 as well.
Adding tests only.


MDEV-8139 Fix Scrubbing · c92f7e28

Thirunarayanan Balathandayuthapani authored Jun 11, 2020

fil_space_t::freed_ranges: Store ranges of freed page numbers.

fil_space_t::last_freed_lsn: Store the most recent LSN of
freeing a page.

fil_space_t::freed_mutex: Protects freed_ranges, last_freed_lsn.

fil_space_create(): Initialize the freed_range mutex.

fil_space_free_low(): Frees the freed_range mutex.

range_set: Ranges of page numbers.

buf_page_create(): Removes the page from freed_ranges when page
is being reused.

btr_free_root(): Remove the PAGE_INDEX_ID invalidation. Because
btr_free_root() and dict_drop_index_tree() are executed in
the same atomic mini-transaction, there is no need to
invalidate the root page.

buf_release_freed_page(): Split from buf_flush_freed_page().
Skip any I/O.

buf_flush_freed_pages(): Get the freed ranges from the tablespace and
write punch-hole or zeroes for the freed ranges.

buf_flush_try_neighbors(): Handles the flushing of freed ranges.

mtr_t::freed_pages: Variable to store the list of freed pages.

mtr_t::add_freed_pages(): To add freed pages.

mtr_t::clear_freed_pages(): To clear the freed pages.

mtr_t::m_freed_in_system_tablespace: Variable to indicate whether page has
been freed in system tablespace.

mtr_t::m_trim_pages: Variable to indicate whether the space has been trimmed.

mtr_t::commit(): Add the freed page and update the last freed lsn
in the tablespace and clear the tablespace freed range if space is
trimmed.

file_name_t::freed_pages: Store the freed pages during recovery.

file_name_t::add_freed_page(), file_name_t::remove_freed_page(): To
add and remove freed page during recovery.

store_freed_or_init_rec(): Store or remove the freed pages while
encountering FREE_PAGE or INIT_PAGE redo log record.

recv_init_crash_recovery_spaces(): Add the freed page encountered
during recovery to respective tablespace.
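
A set of page-number ranges such as the one described for range_set can be sketched as below. This is a hypothetical Python model, not the InnoDB class: it keeps sorted, non-overlapping inclusive ranges, merges neighbours when a page is freed, and splits a range when a page is reused.

```python
class RangeSet:
    """Sorted, non-overlapping, inclusive (first, last) ranges of page numbers."""

    def __init__(self):
        self.ranges = []

    def add(self, page):
        """Record a freed page, merging with adjacent or overlapping ranges."""
        first = last = page
        kept = []
        for lo, hi in self.ranges:
            if hi + 1 < first or last + 1 < lo:
                kept.append((lo, hi))        # not touching the new range
            else:
                first, last = min(first, lo), max(last, hi)
        kept.append((first, last))
        self.ranges = sorted(kept)

    def remove(self, page):
        """Forget a page that is being reused, splitting its range if needed."""
        result = []
        for lo, hi in self.ranges:
            if not lo <= page <= hi:
                result.append((lo, hi))
                continue
            if lo < page:
                result.append((lo, page - 1))
            if page < hi:
                result.append((page + 1, hi))
        self.ranges = result


freed = RangeSet()
for page in (5, 7, 6, 10):
    freed.add(page)
after_add = list(freed.ranges)

freed.remove(6)                  # page 6 is reused
after_remove = list(freed.ranges)
```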


11 Jun, 2020 6 commits

post-fix for #1504 · 07d1c856
Sergei Golubchik authored Jun 10, 2020


MDEV-22812 "failed to create symbolic link" during the build · d3f47482

Sergei Golubchik authored Jun 11, 2020

As the CMake manual says:

  If a sequential execution of multiple commands is required, use multiple
  ``execute_process()`` calls with a single ``COMMAND`` argument.


Merge branch '10.1' into 10.2 · 8c67ffff
Vicențiu Ciorbaru authored Jun 11, 2020


MDEV-21831: Assertion `length == pack_length()' failed in... · 35acf39b

Varun Gupta authored Jun 11, 2020

MDEV-21831: Assertion `length == pack_length()' failed in Field_inet6::sort_string upon INSERT into RocksDB table

For INET6 columns, the values are stored as BINARY columns and returned to the client in TEXT format.
For RocksDB, the indexes store mem-comparable images for columns, so use pack_length() to store
the mem-comparable form for INET6 columns. This also remains consistent with CHAR columns.


MDEV-22850 Reduce buf_pool.page_hash latch contention · 757e756d

Marko Mäkelä authored Jun 11, 2020

For reads, the buf_pool.page_hash is protected by buf_pool.mutex or
by the hash_lock. There is no need to compute or acquire hash_lock
if we are not modifying the buf_pool.page_hash.

However, the buf_pool.page_hash latch must be held exclusively
when changing buf_page_t::in_file(), or if we desire to prevent
buf_page_t::can_relocate() or buf_page_t::buf_fix_count()
from changing.

rw_lock_lock_word_decr(): Add a comment that explains the polling logic.

buf_page_t::set_state(): When in_file() is to be changed, assert that
an exclusive buf_pool.page_hash latch is being held. Unfortunately
we cannot assert this for set_state(BUF_BLOCK_REMOVE_HASH) because
set_corrupt_id() may already have been called.

buf_LRU_free_page(): Check buf_page_t::can_relocate() before
acquiring the hash_lock.

buf_block_t::initialise(): Initialize also page.buf_fix_count().

buf_page_create(): Initialize buf_fix_count while not holding
any mutex or hash_lock. Acquire the hash_lock only for the
duration of inserting the block to the buf_pool.page_hash.

buf_LRU_old_init(), buf_LRU_add_block(),
buf_page_t::belongs_to_unzip_LRU(): Do not assert buf_page_t::in_file(),
because buf_page_create() will invoke buf_LRU_add_block()
before acquiring hash_lock and buf_page_t::set_state().

buf_pool_t::validate(): Rely on the buf_pool.mutex and do not
unnecessarily acquire any buf_pool.page_hash latches.

buf_page_init_for_read(): Clarify that we must acquire the hash_lock
upfront in order to prevent a race with buf_pool_t::watch_remove().


MDEV-21619 Server crash or assertion failures in my_datetime_to_str · e835881c

Alexander Barkov authored Jun 11, 2020

Item_cache_datetime::decimals was always copied from example->decimals
without limiting to 6 (maximum possible fractional digits), so
val_str() later crashed on asserts inside my_time_to_str() and
my_datetime_to_str().
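
The clamp described above can be sketched as follows. The function names are hypothetical, and the input value 31 is only an example of a "too large" decimals value, not a constant taken from the server.

```python
from datetime import datetime

MAX_FRACTIONAL_DIGITS = 6  # microsecond precision


def cached_decimals(example_decimals: int) -> int:
    """Copy the decimals of the cached expression, limited to six digits."""
    return min(example_decimals, MAX_FRACTIONAL_DIGITS)


def datetime_to_str(value: datetime, decimals: int) -> str:
    """Format with the given number of fractional digits (at most six)."""
    assert 0 <= decimals <= MAX_FRACTIONAL_DIGITS
    text = value.strftime("%Y-%m-%d %H:%M:%S")
    if decimals:
        text += "." + f"{value.microsecond:06d}"[:decimals]
    return text


value = datetime(2020, 6, 11, 12, 30, 45, 123456)
text = datetime_to_str(value, cached_decimals(31))
```

Passing 31 directly to `datetime_to_str` would trip the assertion, which mirrors the failure mode the commit describes.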
