    MDEV-31957 Concurrent ALTER and ANALYZE collecting statistics can result in stale statistical data · e3b36b8f
    Monty authored
    Example of what causes the problem:
    T1: ANALYZE TABLE starts to collect statistics
    T2: ALTER TABLE starts by deleting statistics for all changed fields,
        then creates a temp table and copies data to it.
    T1: ANALYZE ends and writes to the statistics tables.
    T2: ALTER TABLE renames temp table in place of the old table.
    
    Now the statistics written by ANALYZE match the old, deleted table.
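The interleaving above can be replayed with a toy model. This is only an illustrative sketch: the dictionaries, the `run()` helper and its `delete_stats_late` flag are hypothetical and do not correspond to server code. It shows why deleting the old statistics early leaves a stale row behind, while deleting them once ALTER TABLE is the only user of the old table does not.

```python
def run(delete_stats_late):
    """Replay the T1/T2 interleaving; return the final statistics table."""
    # Hypothetical model: table maps column -> values, column_stats maps
    # (table name, column name) -> number of distinct values.
    table = {"a": [1, 1, 2]}
    column_stats = {("t1", "a"): 2}

    def delete_old_stats():
        column_stats.pop(("t1", "a"), None)

    # T1: ANALYZE TABLE starts and scans the old table definition.
    collected = {("t1", col): len(set(vals)) for col, vals in table.items()}

    # T2: ALTER TABLE starts (here: column a is replaced by column b).
    if not delete_stats_late:
        delete_old_stats()          # old behaviour: delete statistics first
    temp_table = {"b": [7, 8, 9]}   # copy data to a temp table

    # T1: ANALYZE ends and writes to the statistics table.
    column_stats.update(collected)

    # T2: ALTER TABLE is now the only user of the old table.
    if delete_stats_late:
        delete_old_stats()          # fixed behaviour: delete statistics now
    table = temp_table              # rename temp table in place of the old one

    return column_stats


stale = run(delete_stats_late=False)
fixed = run(delete_stats_late=True)
```

With the early delete, `stale` still holds a row for column `a`, which no longer exists in the table; with the late delete, `fixed` is empty.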
    
    Fixed by waiting to delete old statistics until ALTER TABLE is
    the only one using the old table, and by ensuring that rename of
    columns can handle swapping of column names.
    
    rename_columns_in_stat_table() (formerly rename_column_in_stat_tables())
    now takes a list of columns to rename. It uses the following algorithm
    to update column_stats so that it can handle circular renames:
    
    - While there are columns to be renamed, and this is either the first
      loop or the previous rename loop changed something:
      - Loop over all columns to be renamed:
        - Change the column name in column_stats.
          - If this fails because of a duplicate key:
            - If this is the first change attempt for this column:
              - Change the column name to a temporary column name.
              - If there was a conflicting row, replace it with the current row.
          - else:
            - Remove the entry from the column list.
    
    - Loop over all remaining columns in the list:
      - Remove the conflicting row.
      - Change the column from its temporary name to its final name in
        column_stats.
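The rename algorithm above can be sketched as a small simulation. This is not the server implementation: column_stats is modelled as a plain dictionary keyed by column name (standing in for the unique key), and `rename_columns_in_stats()`, the `pending` records and the `#tmp_N` temporary names are all hypothetical. Parking a row under a temporary name is counted here as "changed something", which is one possible reading of the loop condition.

```python
def rename_columns_in_stats(stats, renames):
    """Apply (old_name, new_name) renames to stats, allowing cycles."""
    pending = [{"old": old, "new": new, "temp": None} for old, new in renames]
    first_loop = True
    changed = True
    temp_counter = 0

    while pending and (first_loop or changed):
        first_loop = False
        changed = False
        for entry in list(pending):      # copy: entries are removed below
            current = entry["temp"] or entry["old"]
            if current not in stats:
                # No statistics row for this column: nothing to rename.
                pending.remove(entry)
                changed = True
            elif entry["new"] not in stats:
                # Rename succeeded: drop the entry from the list.
                stats[entry["new"]] = stats.pop(current)
                pending.remove(entry)
                changed = True
            elif entry["temp"] is None:
                # Duplicate key on the first attempt: park the row under a
                # temporary name (overwriting any row already using it).
                temp_counter += 1
                entry["temp"] = f"#tmp_{temp_counter}"   # hypothetical name
                stats[entry["temp"]] = stats.pop(current)
                changed = True
            # Otherwise: still blocked; retry in the next loop.

    # Remaining entries are blocked by rows that are not going away.
    for entry in pending:
        stats.pop(entry["new"], None)                    # remove conflicting row
        stats[entry["new"]] = stats.pop(entry["temp"])   # temp -> final name
    return stats


swapped = rename_columns_in_stats(
    {"a": "stats for a", "b": "stats for b"},
    [("a", "b"), ("b", "a")],
)
```

In the swap case, the first rename is parked under a temporary name, the second then succeeds because `a` is free, and the next loop moves the parked row to `b`, so the two statistics rows end up exchanged.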
    
    Other things:
    - Don't flush tables for every operation. Only flush when all updates
      are done.
    - Rename of columns was not handled in case of ALGORITHM=copy (old bug).
      - Fixed that we do not collect statistics for hidden hash columns
        used by UNIQUE constraint on long values.
      - Fixed that we do not collect statistics for blob columns referred to by
        generated virtual columns. This was achieved by storing the fields for
        which we want to have statistics in table->has_value_set instead of
        in table->read_set.
    - Rename of indexes was not handled for persistent statistics.
      - This is now handled similarly to rename of columns. Renamed indexes
        are now stored in 'rename_stat_indexes' and handled in
        Alter_info::delete_statistics() together with dropped indexes.
    - ALTER TABLE .. ADD INDEX may, instead of creating a new index, rename
      an existing generated foreign key index. This was not reflected in
      the index_stats table because this was handled in
      mysql_prepare_create_table() instead of in the mysql_alter() code.
      Fixed by adding a call in mysql_prepare_create_table() to drop the
      changed index.
      I also had to replace the code that 'marked the index' as ignored
      with code that does not destroy the original index name.
    
    Reviewer: Sergei Petrunia <sergey@mariadb.com>