    Fix for BUG#39363 "Concurent inserts in the same table lead to hang in maria engine" · c9bb8999
    Guilhem Bichot authored
    (need a mutex when modifying bitmap->non_flushable), which I hit when running maria_bulk_insert.yy.
    After fixing this, I hit an assertion in check_and_set_lsn() saying that the page was PAGECACHE_PLAIN_PAGE.
    This could be caused by pages left by an operation which had transactions disabled (like a bulk insert with repair):
    in this patch we remove those pages from the cache when we re-enable transactions.
    After fixing this, I get page cache deadlocks; pushbuild2 also shows some. These remain to be looked at.
    No testcase, requires concurrency and running for 15 minutes, but automatically tested by pushbuild2.
    
    
    storage/maria/ma_bitmap.c:
      Doing bitmap->non_flushable++ without mutex was wrong. If this ++ happened while another ++ or -- was happening
      in another thread, one ++ or -- could be missed and the bitmap code would behave wrongly. For example, if a ++
      was missed, the DBUG_ASSERT(((int) (bitmap->non_flushable)) >= 0) in _ma_bitmap_release_unused() could fire.
      I saw this assertion happen in practice in maria_bulk_insert.yy. Adding this mutex lock eliminated
      the assertion problem.
      The >=0 in that assertion was also wrong: it should be >0, otherwise the variable could go negative.
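
      The lost-update problem described above can be illustrated with a minimal sketch. It is not the
      ma_bitmap.c code: demo_bitmap, demo_worker and demo_run are hypothetical names, and a plain pthread
      mutex stands in for whatever lock the engine uses. Each thread increments and then decrements the
      counter under the mutex, so no update is lost and the counter ends at 0.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for the bitmap structure; only the counter matters. */
typedef struct {
  pthread_mutex_t lock;
  unsigned int non_flushable;
} demo_bitmap;

enum { DEMO_THREADS = 4, DEMO_ITERATIONS = 100000 };

static void *demo_worker(void *arg)
{
  demo_bitmap *bitmap = arg;
  for (int i = 0; i < DEMO_ITERATIONS; i++) {
    /* Increment under the mutex: the read-modify-write cannot interleave. */
    pthread_mutex_lock(&bitmap->lock);
    bitmap->non_flushable++;
    pthread_mutex_unlock(&bitmap->lock);

    /* Decrement under the mutex; the counter must be > 0 beforehand. */
    pthread_mutex_lock(&bitmap->lock);
    assert(bitmap->non_flushable > 0);
    bitmap->non_flushable--;
    pthread_mutex_unlock(&bitmap->lock);
  }
  return NULL;
}

/* Runs the workers and returns the final counter (0 if nothing was lost). */
unsigned int demo_run(void)
{
  demo_bitmap bitmap;
  pthread_t threads[DEMO_THREADS];

  pthread_mutex_init(&bitmap.lock, NULL);
  bitmap.non_flushable = 0;

  for (int i = 0; i < DEMO_THREADS; i++)
    pthread_create(&threads[i], NULL, demo_worker, &bitmap);
  for (int i = 0; i < DEMO_THREADS; i++)
    pthread_join(threads[i], NULL);

  pthread_mutex_destroy(&bitmap.lock);
  return bitmap.non_flushable;
}
```

      Calling demo_run() from a test program (built with -pthread) should return 0. Removing the lock and
      unlock calls makes the final value unpredictable, which is the kind of failure the real assertion caught.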
    storage/maria/ma_recovery.c:
      When we re-enable transactionality, we may have created pages of type PAGECACHE_PLAIN_PAGE before,
      so we need to remove them from the cache (FLUSH_RELEASE). Otherwise they would stay this way, and later, when we
      maria_write() to them, we would try to tag them with an LSN (ma_unpin_all_pages()), which is incorrect
      for a plain page (and causes an assertion in the page cache at the start of check_and_set_lsn()).
      I saw the assertion fire with maria_bulk_insert.yy, and this seems to cure it.
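
      The reasoning above can be sketched with a toy model. This is not the Maria page cache API: every
      demo_* name is hypothetical, and the model assumes a page's type is fixed when it enters the cache.
      Under that assumption, a page cached while transactions were disabled stays plain until it is
      evicted, so re-enabling transactions has to release the cached pages first.

```c
#include <stddef.h>

typedef enum { DEMO_PLAIN_PAGE, DEMO_LSN_PAGE } demo_page_type;

typedef struct {
  int in_cache;
  demo_page_type type;
} demo_page;

enum { DEMO_PAGES = 4 };

typedef struct {
  int transactional;
  demo_page pages[DEMO_PAGES];
} demo_table;

/* Brings a page into the cache; its type is decided at that moment. */
static void demo_touch_page(demo_table *table, size_t i)
{
  if (!table->pages[i].in_cache) {
    table->pages[i].in_cache = 1;
    table->pages[i].type = table->transactional ? DEMO_LSN_PAGE
                                                : DEMO_PLAIN_PAGE;
  }
}

/* Re-enables transactions and evicts every cached page of the table,
   loosely mimicking a flush-and-release of the file's pages. */
static void demo_enable_transactions(demo_table *table)
{
  table->transactional = 1;
  for (size_t i = 0; i < DEMO_PAGES; i++)
    table->pages[i].in_cache = 0;
}

/* Tagging with an LSN is only valid for a cached LSN page. */
static int demo_can_set_lsn(const demo_page *page)
{
  return page->in_cache && page->type == DEMO_LSN_PAGE;
}

/* Returns 1 if page 0 can be tagged with an LSN after the switch. */
int demo_scenario(void)
{
  demo_table table = {0};            /* transactions disabled */
  demo_touch_page(&table, 0);        /* cached as a plain page */
  demo_enable_transactions(&table);  /* evicts the plain page */
  demo_touch_page(&table, 0);        /* re-cached as an LSN page */
  return demo_can_set_lsn(&table.pages[0]);
}
```

      In this model demo_scenario() returns 1. Without the eviction loop in demo_enable_transactions(),
      page 0 would stay plain and the LSN check would fail, mirroring the assertion described above.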