    WL#3146 "less locking in auto_increment": · 60272e75
    unknown authored
    this is a cleanup patch for our current auto_increment handling:
    new names for auto_increment variables in THD, new methods to manipulate them
    (see sql_class.h), some move into handler::, causing less backup/restore
    work when executing substatements. 
    This hopefully makes the logic clearer; less work is needed in
    mysql_insert().
    By cleaning up and using different variables for different purposes (instead
    of one for 3 things...), we fix these bugs, which someone may want to fix
    in 5.0 too:
    BUG#20339 "stored procedure using LAST_INSERT_ID() does not replicate
    statement-based"
    BUG#20341 "stored function inserting into one auto_increment puts bad
    data in slave"
    BUG#19243 "wrong LAST_INSERT_ID() after ON DUPLICATE KEY UPDATE"
    (now if a row is updated, LAST_INSERT_ID() will return its id)
    and re-fixes:
    BUG#6880 "LAST_INSERT_ID() value changes during multi-row INSERT"
    (already fixed differently by Ramil in 4.1)
    Test of documented behaviour of mysql_insert_id() (there was no test).
    The behaviour changes introduced are:
    - LAST_INSERT_ID() now returns "the first autogenerated auto_increment value
    successfully inserted", instead of "the first autogenerated auto_increment
    value if any row was successfully inserted", see auto_increment.test.
    Same for mysql_insert_id(), see mysql_client_test.c.
    - LAST_INSERT_ID() returns the id of the updated row when ON DUPLICATE KEY
    UPDATE updates a row, see auto_increment.test. Same for mysql_insert_id(),
    see mysql_client_test.c.
    - LAST_INSERT_ID() does not change if no autogenerated value was successfully
    inserted (it used to become 0 in that case), see auto_increment.test.
    - if in INSERT SELECT no autogenerated value was successfully inserted,
    mysql_insert_id() now returns the id of the last inserted row (it already
    did this for INSERT VALUES), see mysql_client_test.c.
    - if INSERT SELECT uses LAST_INSERT_ID(X), mysql_insert_id() now returns X
    (it already did this for INSERT VALUES), see mysql_client_test.c.
    - NDB now behaves like other engines wrt SET INSERT_ID: with INSERT IGNORE,
    the id passed in SET INSERT_ID is re-used until a row succeeds; SET INSERT_ID
    influences not only the first row now.
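
    As a rough illustration of the first and third behaviour changes above,
    here is a toy model, not the server code. ToyRow, ToyConnection and
    toy_run_insert() are hypothetical names; the model ignores ON DUPLICATE
    KEY UPDATE and LAST_INSERT_ID(X).

```cpp
#include <cstdint>
#include <vector>

// Hypothetical description of one row of a multi-row INSERT.
struct ToyRow {
  std::uint64_t generated_id;  // 0 means no value was autogenerated for this row
  bool inserted;               // false if the row failed (e.g. skipped by INSERT IGNORE)
};

// Hypothetical stand-in for the per-connection state.
struct ToyConnection {
  std::uint64_t last_insert_id = 0;  // what LAST_INSERT_ID() would return
};

// Apply the new rule: remember the first autogenerated value that was
// successfully inserted; if there is none, leave the old value untouched.
void toy_run_insert(ToyConnection &conn, const std::vector<ToyRow> &rows) {
  std::uint64_t first_successful = 0;
  for (const ToyRow &row : rows) {
    if (first_successful == 0 && row.inserted && row.generated_id > 0)
      first_successful = row.generated_id;
  }
  if (first_successful > 0)
    conn.last_insert_id = first_successful;
}
```

    In this model, a statement whose first row fails and whose second row
    succeeds with id 2 leaves LAST_INSERT_ID() at 2, and a later statement
    that inserts nothing keeps it at 2 instead of resetting it to 0.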
    
    Additionally, when unlocking a table we check that the thread is not keeping
    a next_insert_id (as the table is unlocked that id is potentially out-of-date);
    forgetting about this next_insert_id is done in a new
    handler::ha_release_auto_increment().
    
    Finally we prepare for engines capable of reserving finite-length intervals
    of auto_increment values: we store such intervals in THD. The next step
    (to be done by the replication team in 5.1) is to read those intervals from
    THD and actually store them in the statement-based binary log. NDB
    will be a good engine to test that.
    
    
    mysql-test/extra/binlog_tests/binlog.test:
      Testing that if INSERT_ID is set to a value too big for the
      column's type, the binlogged INSERT_ID is the truncated value
      (important if slave has a column of a "wider" numeric type).
      Testing binlogging of INSERT_ID with INSERT DELAYED, to be sure that 
      we binlog an INSERT_ID event only for the delayed rows which use one.
    mysql-test/extra/rpl_tests/rpl_insert_id.test:
      Testcase for BUG#20339 "stored procedure using
      LAST_INSERT_ID() does not replicate statement-based".
      Testcase for BUG#20341 "stored function inserting into one
      auto_increment puts bad data in slave".
    mysql-test/extra/rpl_tests/rpl_loaddata.test:
      Test that LOAD DATA INFILE sets a value for a future LAST_INSERT_ID().
    mysql-test/r/auto_increment.result:
      Behaviour change: when INSERT fails totally (it does not even succeed
      partially and then roll back), LAST_INSERT_ID() is not changed.
      Behaviour change: LAST_INSERT_ID() is now the first successfully inserted,
      autogenerated, id.
      Behaviour change: in INSERT ... ON DUPLICATE KEY UPDATE, if the table has an
      auto_increment column and a row is updated, LAST_INSERT_ID() returns the id
      of this row.
    mysql-test/r/binlog_row_binlog.result:
      result update
    mysql-test/r/binlog_stm_binlog.result:
      result update
    mysql-test/r/insert.result:
      result update
    mysql-test/r/rpl_insert_id.result:
      result update
    mysql-test/r/rpl_loaddata.result:
      result update
    mysql-test/r/rpl_ndb_auto_inc.result:
      ndb's behaviour is now like other engines wrt SET INSERT_ID
      in a multi-row INSERT:
      - with INSERT IGNORE: the id passed in SET INSERT_ID is re-used until
      a row succeeds.
      - generally, SET INSERT_ID sets the first value and the other values are
      simply computed from this first value; previously the 2nd and
      subsequent values were not influenced by SET INSERT_ID.
      This good change is due to the removal of "thd->next_insert_id=0"
      from ha_ndbcluster.
    mysql-test/t/auto_increment.test:
      A testcase of BUG#19243: if ON DUPLICATE KEY UPDATE updates a row,
      LAST_INSERT_ID() now returns the id of the row.
      Test of new behaviour of last_insert_id() when no autogenerated value was
      inserted, or when only some autogenerated value (not the first of them) was
      inserted.
    mysql-test/t/insert.test:
      testing INSERT IGNORE re-using generated values
    sql/ha_federated.cc:
      update for new variables.
    sql/ha_ndbcluster.cc:
      handler::auto_increment_column_changed not needed, equivalent to
      (insert_id_for_cur_row > 0).
      thd->next_insert_id=0 not needed anymore; it was used to force
      handler::update_auto_increment() to call ha_ndbcluster::get_auto_increment()
      for each row of a multi-row INSERT, now this happens naturally
      because NDB says "I have reserved you *one* value" in get_auto_increment(),
      so handler::update_auto_increment() calls again for next row.
    sql/handler.cc:
      More comments, use of new methods and variables. Hopefully clearer
      than the current code.
      thd->prev_insert_id not in THD anymore: it is managed locally by inserters
      (like mysql_insert()).
      THD::clear_next_insert_id is now equivalent to
      handler::next_insert_id > 0.
      get_auto_increment() reserves an interval of values from the engine and
      uses this interval for the next rows of the statement until the interval
      is exhausted; then it asks for another interval (of a bigger size
      than the first one; size doubles until reaching 65535 then it stays constant).
      If doing statement-based binlogging, intervals are remembered in a list
      for storage in the binlog.
      For "forced" insert_id values (SET INSERT_ID or replication slave),
      forced_auto_inc_intervals is non-empty and the handler takes its intervals
      from there, without calling get_auto_increment().
      ha_release_auto_increment() resets the handler's auto_increment variables;
      it calls release_auto_increment() which is handler-dependent and
      serves to return to the engine any unused tail of the last used
      interval.
      If ending a statement, next_insert_id>0 means that autoinc values have been
      generated or taken from the master's binlog (in a replication slave) so
      we clear those values read from binlog, so that next top- or sub-
      statement does not use them.
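
      The interval reservation described above can be sketched roughly as
      follows. This is a toy model, not the code in handler.cc: ToyEngine,
      ToyInserter and toy_next_interval_size() are hypothetical names, the
      first interval is assumed to hold a single value, and the increment
      between values is assumed to be 1.

```cpp
#include <algorithm>
#include <cstdint>

// Cap on the interval size, taken from the description above.
constexpr std::uint64_t kMaxIntervalSize = 65535;

// Size of the next interval to request: 1 for the first request (an
// assumption), then double the previous size, never exceeding the cap.
std::uint64_t toy_next_interval_size(std::uint64_t previous_size) {
  if (previous_size == 0) return 1;
  return std::min(previous_size * 2, kMaxIntervalSize);
}

// Hypothetical engine that hands out consecutive values on request.
struct ToyEngine {
  std::uint64_t next_free = 1;
  int calls = 0;  // how many times the engine was asked for an interval

  // Reserve `wanted` consecutive values and return the first one.
  std::uint64_t reserve(std::uint64_t wanted) {
    std::uint64_t first = next_free;
    next_free += wanted;
    ++calls;
    return first;
  }
};

// Hypothetical per-statement consumer of reserved intervals.
struct ToyInserter {
  ToyEngine &engine;
  std::uint64_t next_value = 0;     // next value to hand out
  std::uint64_t remaining = 0;      // values left in the current interval
  std::uint64_t interval_size = 0;  // size of the last reserved interval

  explicit ToyInserter(ToyEngine &e) : engine(e) {}

  // Return the id for the next row, asking the engine for a new (larger)
  // interval only when the current one is exhausted.
  std::uint64_t next_id() {
    if (remaining == 0) {
      interval_size = toy_next_interval_size(interval_size);
      next_value = engine.reserve(interval_size);
      remaining = interval_size;
    }
    --remaining;
    return next_value++;
  }
};
```

      With these assumptions, seven rows need only three calls to the engine
      (intervals of 1, 2 and 4 values), which is the point of growing the
      interval: long statements ask the engine less and less often.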
    sql/handler.h:
      handler::auto_increment_changed can be replaced by
      (handler::insert_id_for_cur_row > 0).
      THD::next_insert_id moves into handler (more natural, and prepares
      for the day when we'll support a single statement inserting into
      two tables - "multi-table INSERT" like we have UPDATE - will this
      happen?).
      This move makes the backup/restore of THD::next_insert_id when entering
      a substatement unneeded, as each substatement has its own handler
      objects.
    sql/item_func.cc:
      new names for variables.
      For the setting of what mysql_insert_id() will return to the client,
      LAST_INSERT_ID(X) used to simply pretend that the generated autoinc
      value for the current row was X, but this led to having no reliable
      way to know the really generated value, so we now have a bool:
      thd->arg_of_last_insert_id_function which enables us to know that
      LAST_INSERT_ID(X) was called (and then X can be found in
      thd->first_successful_insert_id_in_prev_stmt).
    sql/log.cc:
      new variable names for insert_ids. Removing some unused variables in the slow
      log.
    sql/log_event.cc:
      new variable names, comments. Preparing for when masters won't binlog
      LAST_INSERT_ID if it was 0.
    sql/set_var.cc:
      new variable names.
      The last change repeats how Bar fixed BUG#20392
      "INSERT_ID session variable has weird value" in 5.0.
    sql/sql_class.cc:
      new variables for insert_id. In THD::cleanup_after_query() we fix
      BUG#20339 "stored procedure using LAST_INSERT_ID() does not replicate
      statement-based" (will one want to fix it in 5.0?). Many comments
      about what stored functions do to auto_increment.
      In reset|restore_sub_statement_state(), we need to back up fewer
      auto_inc variables as some of them have moved to the handler;
      we back up/restore those which are about the current top- or sub-
      statement, *not* those about the statement-based binlog
      (which evolve as the top- and sub-statement execute).
      Because we split THD::last_insert_id into 
      THD::first_successful_insert_id_in_prev_stmt and
      THD::auto_inc_intervals_for_binlog (among others), we fix
      BUG#20341 "stored function inserting into one auto_increment
      puts bad data in slave": indeed we can afford to not backup/restore
      THD::auto_inc_intervals_for_binlog (which fixes the bug) while still
      backing up / restoring THD::first_successful_insert_id_in_prev_stmt
      (ensuring that the top-level LAST_INSERT_ID() is not affected by INSERTs
      done by sub-statements, as is desirable and tested in rpl_insert_id.test).
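
      A minimal sketch of that split, under simplifying assumptions: the
      member names follow this patch, but ToySession, ToySubStatementBackup
      and the three functions are hypothetical, and the binlog intervals are
      reduced to a plain list of values.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the session state discussed above.
struct ToySession {
  std::uint64_t first_successful_insert_id_in_prev_stmt = 0;
  // Simplified: the real member is a list of intervals, not of values.
  std::vector<std::uint64_t> auto_inc_intervals_for_binlog;
};

// Only the state that belongs to the current top- or sub-statement is saved.
struct ToySubStatementBackup {
  std::uint64_t first_successful_insert_id_in_prev_stmt = 0;
};

void toy_enter_sub_statement(const ToySession &s, ToySubStatementBackup &b) {
  b.first_successful_insert_id_in_prev_stmt =
      s.first_successful_insert_id_in_prev_stmt;
}

void toy_leave_sub_statement(ToySession &s, const ToySubStatementBackup &b) {
  // Restore what LAST_INSERT_ID() returns at the top level...
  s.first_successful_insert_id_in_prev_stmt =
      b.first_successful_insert_id_in_prev_stmt;
  // ...but deliberately keep auto_inc_intervals_for_binlog as it evolved.
}

// A sub-statement (for example a stored function) inserting one row.
void toy_sub_statement_insert(ToySession &s, std::uint64_t generated_id) {
  s.auto_inc_intervals_for_binlog.push_back(generated_id);
  s.first_successful_insert_id_in_prev_stmt = generated_id;
}
```

      After entering, inserting and leaving, the top-level value is back to
      what it was, while the id generated by the sub-statement is still
      available for the binlog.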
    sql/sql_class.h:
      new variables and methods for auto_increment.
      Some THD members move into handler (those which are really about
      the table being inserted), some stay in THD (those which are
      about what a future LAST_INSERT_ID() should return, or about
      what should be stored into the statement-based binlog).
      THD::next_insert_id moves to handler::.
      THD::clear_next_insert_id removed (had become equivalent
      to next_insert_id > 0).
      THD::last_insert_id becomes four:
      THD::first_successful_insert_id_in_cur_stmt,
      THD::auto_inc_intervals_for_binlog,
      handler::insert_id_for_cur_row,
      THD::first_successful_insert_id_in_prev_stmt.
      THD::current_insert_id becomes:
      THD::first_successful_insert_id_in_prev_stmt_for_binlog
      THD::prev_insert_id is removed, handler can just use
      handler::insert_id_for_cur_row instead (which is more accurate:
      for the first row, prev_insert_id was set before get_auto_increment
      was called, so was 0, causing a call to
      get_auto_increment() for the 2nd row if the 1st row fails;
      here we don't need the call as insert_id_for_cur_row has
      the value of the first row).
      THD::last_insert_id_used becomes: stmt_depends_on_first_row_in_prev_stmt
      THD::insert_id_used is removed (equivalent to
      auto_inc_intervals_for_binlog non empty).
      The interval returned by get_auto_increment() and currently being
      consumed is handler::auto_inc_interval_for_cur_row.
      Comments to explain each of them.
      select_insert::last_insert_id becomes autoinc_value_of_last_inserted_row.
    sql/sql_insert.cc:
      the "id" variable is not changed for each row now; it used to compensate for
      this contradiction:
      - thd->last_insert_id's supposed job was to keep the id of the first row
      - but it was updated for every row
      - so mysql_insert() made sure to catch its first value and restore it at the end of stmt.
      Now THD keeps the first value in first_successful_insert_id_in_cur_stmt,
      and the handler keeps the value of the current row in
      insert_id_for_cur_row. So "id" only serves to fill mysql_insert_id(), as,
      depending on some conditions, "id" must take different values.
      Prev_insert_id moves from THD to write_record().
      We now set LAST_INSERT_ID() in ON DUPLICATE KEY UPDATE too (BUG#19243).
      In an INSERT DELAYED, we still "reset auto-increment caching" but differently
      (by calling ha_release_auto_increment()).
    sql/sql_load.cc:
      no need to fiddle with "id", THD maintains
      THD::first_successful_insert_id_in_cur_stmt by itself and correctly now.
      ha_release_auto_increment() is now (logically) called before we unlock
      the table.
    sql/sql_parse.cc:
      update to new variable names.
      Assertion that reset_thd_for_next_command() is not called for every
      substatement of a routine (I'm not against it, but if we do this change,
      statement-based binlogging needs some adjustments).
    sql/sql_select.cc:
      update for new variable names
    sql/sql_table.cc:
      next_insert_id not needed in mysql_alter_table(), THD manages.
    sql/sql_update.cc:
      update for new variable names.
      Even though this is UPDATE, an insert id can be generated (by
      LAST_INSERT_ID(X)) and should be recorded because mysql_insert_id() wants
      to know about it.
    sql/structs.h:
      A class for "discrete" intervals (intervals of integer numbers with a certain
      increment between them): Discrete_interval, and a class for a list of such
      intervals: Discrete_intervals_list
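
      To make the idea concrete, here is a small sketch of what such classes
      could look like. It is not the code added to structs.h:
      ToyDiscreteInterval and ToyDiscreteIntervalsList are hypothetical
      names, and merging adjacent intervals on append is an assumption.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Values first, first+increment, ..., first+(count-1)*increment.
struct ToyDiscreteInterval {
  std::uint64_t first;      // first value of the interval
  std::uint64_t count;      // number of values in the interval
  std::uint64_t increment;  // distance between two consecutive values

  // Where an interval that directly continues this one would start.
  std::uint64_t end() const { return first + count * increment; }

  bool contains(std::uint64_t v) const {
    return count > 0 && v >= first && v < end() &&
           (v - first) % increment == 0;
  }
};

class ToyDiscreteIntervalsList {
 public:
  // Append an interval; if it directly continues the last one with the
  // same increment, extend the last one instead of adding a new element.
  void append(const ToyDiscreteInterval &iv) {
    if (iv.count == 0) return;
    if (!items_.empty()) {
      ToyDiscreteInterval &last = items_.back();
      if (last.increment == iv.increment && last.end() == iv.first) {
        last.count += iv.count;
        return;
      }
    }
    items_.push_back(iv);
  }

  std::size_t size() const { return items_.size(); }
  const ToyDiscreteInterval &at(std::size_t i) const { return items_.at(i); }

 private:
  std::vector<ToyDiscreteInterval> items_;
};
```

      For example, an interval starting at 5 with 3 values and an increment
      of 10 holds 5, 15 and 25; appending {1,2,1} and then {3,4,1} to a list
      leaves a single interval of 6 values.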
    tests/mysql_client_test.c:
      tests of behaviour of mysql_insert_id(): there were no such tests, while in
      our manual we document its behaviour. In comments you'll notice the behaviour
      changes introduced (there are 5).