Next QMGR 1
Next NDBCNTR 1001
Next NDBFS 2000
Next DBACC 3002
Next DBTUP 4029
Next DBLQH 5045
Next DBDICT 6007
Next DBDIH 7181
Next DBTC 8039
Next CMVMI 9000
Next BACKUP 10038
Next DBUTIL 11002
Next DBTUX 12008
Next SUMA 13001
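
The table above suggests that each block owns one band of 1000 error codes
(QMGR below 1000, NDBCNTR 1000-1999, and so on up to SUMA 13000-13999). A
minimal sketch of that lookup follows. The banding is inferred from the table,
and the helper name block_for is hypothetical, not part of NDB.

```python
# Block names in band order; band i is assumed to cover codes i*1000 .. i*1000+999.
# This layout is inferred from the "Next <BLOCK> <code>" table, not from NDB source.
BLOCKS = [
    "QMGR", "NDBCNTR", "NDBFS", "DBACC", "DBTUP", "DBLQH", "DBDICT",
    "DBDIH", "DBTC", "CMVMI", "BACKUP", "DBUTIL", "DBTUX", "SUMA",
]


def block_for(code: int) -> str:
    """Return the block assumed to own an error-insert code."""
    if code <= 0 or code >= len(BLOCKS) * 1000:
        raise ValueError(f"error code {code} is outside the known bands")
    return BLOCKS[code // 1000]


if __name__ == "__main__":
    for code in (911, 5007, 7000, 10001):
        print(code, block_for(code))
```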

TESTING NODE FAILURE, ARBITRATION
---------------------------------

911 - 919:
Crash president when he starts to run in ArbitState 1-9.
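
If codes 911-919 correspond one-to-one and in order to ArbitState 1-9 (an
assumption read off the range above), the code for a given state can be
computed as in this sketch. The function name is hypothetical.

```python
def arbit_crash_code(arbit_state: int) -> int:
    """Map an ArbitState (1..9) to the crash code in the 911..919 range.

    Assumes state 1 pairs with 911, state 2 with 912, and so on.
    """
    if not 1 <= arbit_state <= 9:
        raise ValueError("ArbitState must be in 1..9")
    return 910 + arbit_state
```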

910: Crash new president after node crash

934 : Crash president in ALLOC_NODE_ID_REQ

935 : Crash master on node failure (delayed) 
      and skip sending GSN_COMMIT_FAILREQ to specified node

ERROR CODES FOR TESTING NODE FAILURE, GLOBAL CHECKPOINT HANDLING:
-----------------------------------------------------------------

7000:
Insert system error in master when global checkpoint is idle.

7001:
Insert system error in master after receiving GCP_PREPARE from
all nodes in the cluster.

7002:
Insert system error in master after receiving GCP_NODEFINISH from
all nodes in the cluster.

7003:
Insert system error in master after receiving GCP_SAVECONF from
all nodes in the cluster.

7004:
Insert system error in master after completing global checkpoint with
all nodes in the cluster.

7005:
Insert system error in GCP participant when receiving GCP_PREPARE.

7006:
Insert system error in GCP participant when receiving GCP_COMMIT.

7007:
Insert system error in GCP participant when receiving GCP_TCFINISHED.

7008:
Insert system error in GCP participant when receiving COPY_GCICONF.

5000:
Insert system error in GCP participant when receiving GCP_SAVEREQ.

5007:
Delay GCP_SAVEREQ by 10 secs

7165: Delay INCL_NODE_REQ in starting node, yielding error in GCP_PREPARE

7030: Delay in GCP_PREPARE until node has completed a node failure
7031: Delay in GCP_PREPARE and die 3s later

7177: Delay copying of sysfileData in execCOPY_GCIREQ

7180: Crash master during master-take-over in execMASTER_LCPCONF

ERROR CODES FOR TESTING NODE FAILURE, LOCAL CHECKPOINT HANDLING:
-----------------------------------------------------------------

7009:
Insert system error in master when local checkpoint is idle.

7010:
Insert system error in master when local checkpoint is in the
state clcpStatus = CALCULATE_KEEP_GCI.

7011:
Stop local checkpoint in the state CALCULATE_KEEP_GCI.

7012:
Restart local checkpoint after stopping in CALCULATE_KEEP_GCI.

Method:
1) Error 7011 in master, wait until report of stopped.
2) Error xxxx in participant to crash it.
3) Error 7012 in master to start again.
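
The three steps above can be sketched as a list of management-client commands.
This assumes the "<node id> ERROR <code>" command form of the ndb_mgm client.
The node ids and the participant code are placeholders supplied by the caller,
since the method leaves the participant code as xxxx. The sketch only builds
the command strings and does not run them.

```python
def mgm_error_cmd(node_id: int, code: int) -> str:
    """Format one error-insert command (assumed ndb_mgm syntax)."""
    return f"{node_id} ERROR {code}"


def lcp_stop_crash_restart(master_id: int, participant_id: int,
                           participant_code: int) -> list[str]:
    """Build the command sequence for the 7011 / crash / 7012 method."""
    return [
        # 1) Stop the local checkpoint in CALCULATE_KEEP_GCI on the master.
        mgm_error_cmd(master_id, 7011),
        # (The operator waits for the "stopped" report before continuing.)
        # 2) Crash the participant with a caller-chosen code.
        mgm_error_cmd(participant_id, participant_code),
        # 3) Restart the local checkpoint on the master.
        mgm_error_cmd(master_id, 7012),
    ]


if __name__ == "__main__":
    # Node ids 1 and 2 and code 5010 are illustrative values only.
    for cmd in lcp_stop_crash_restart(1, 2, 5010):
        print(cmd)
```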

7013:
Insert system error in master when local checkpoint is in the
state clcpStatus = COPY_GCI before sending COPY_GCIREQ.

7014:
Insert system error in master when local checkpoint is in the
state clcpStatus = TC_CLOPSIZE before sending TC_CLOPSIZEREQ.

7015:
Insert system error in master when local checkpoint is in the
state clcpStatus = START_LCP_ROUND before sending START_LCP_ROUND.

7016:
Insert system error in master when local checkpoint is in the
state clcpStatus = START_LCP_ROUND after receiving LCP_REPORT.

7017:
Insert system error in master when local checkpoint is in the
state clcpStatus = TAB_COMPLETED.

7018:
Insert system error in master when local checkpoint is in the
state clcpStatus = TAB_SAVED before sending DIH_LCPCOMPLETE.

7019:
Insert system error in master when local checkpoint is in the
state clcpStatus = IDLE before sending CONTINUEB(ZCHECK_TC_COUNTER).

7020:
Insert system error in local checkpoint participant at reception of
COPY_GCIREQ.

7075: Master
Don't send any LCP_FRAG_ORD(last=true)
And crash when all have "not" been sent

8000: Crash participant when receiving TCGETOPSIZEREQ
8001: Crash participant when receiving TC_CLOPSIZEREQ
5010: Crash any when receiving LCP_FRAGORD

7021: Crash in  master when receiving START_LCP_REQ
7022: Crash in !master when receiving START_LCP_REQ

7023: Crash in  master when sending START_LCP_CONF
7024: Crash in !master when sending START_LCP_CONF

7025: Crash in  master when receiving LCP_FRAG_REP
7016: Crash in !master when receiving LCP_FRAG_REP

7026: Crash in  master when changing state to LCP_TAB_COMPLETED 
7017: Crash in !master when changing state to LCP_TAB_COMPLETED 

7027: Crash in  master when changing state to LCP_TAB_SAVED
7018: Crash in !master when changing state to LCP_TAB_SAVED

ERROR CODES FOR TESTING NODE FAILURE, FAILURE IN COPY FRAGMENT PROCESS:
-----------------------------------------------------------------------

5002:
Insert node failure in starting node when receiving a tuple copied from the copy node
as part of copy fragment process.

5003:
Insert node failure when receiving ABORT signal.

5004:
Insert node failure handling when receiving COMMITREQ.

5005:
Insert node failure handling when receiving COMPLETEREQ.

5006:
Insert node failure handling when receiving ABORTREQ.

5042:
As 5002, but with specified table (see DumpStateOrd)

These error codes can be combined with error codes for testing time-out
handling in DBTC to ensure that node failures are also well handled in
time-out handling. They can also be used to test multiple node failure
handling.
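
As a rough illustration, the candidate combinations can be enumerated as a
test matrix. The two lists below are copied from this file (the copy fragment
codes above and the DBTC time-out codes 8040-8045). The function name is
hypothetical, and the sketch does not claim that every pair is meaningful.

```python
from itertools import product

# Node-failure codes from the copy fragment section of this file.
NODE_FAILURE_CODES = [5002, 5003, 5004, 5005, 5006, 5042]
# Time-out codes from the DBTC time-out section of this file.
DBTC_TIMEOUT_CODES = [8040, 8041, 8042, 8043, 8044, 8045]


def failure_timeout_matrix() -> list[tuple[int, int]]:
    """Return every (node-failure code, DBTC time-out code) pair."""
    return list(product(NODE_FAILURE_CODES, DBTC_TIMEOUT_CODES))


if __name__ == "__main__":
    for failure_code, timeout_code in failure_timeout_matrix():
        print(failure_code, timeout_code)
```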

ERROR CODES FOR TESTING TIME-OUT HANDLING IN DBLQH
-------------------------------------------------
5011:
Delay execution of COMMIT signal 2 seconds to generate time-out.

5012 (use 5017):
First delay execution of COMMIT signal 2 seconds to generate COMMITREQ.
Delay execution of COMMITREQ signal 2 seconds to generate time-out.

5013:
Delay execution of COMPLETE signal 2 seconds to generate time-out.

5014 (use 5018):
First delay execution of COMPLETE signal 2 seconds to generate COMPLETEREQ.
Delay execution of COMPLETEREQ signal 2 seconds to generate time-out.

5015:
Delay execution of ABORT signal 2 seconds to generate time-out.

5016: (ABORTREQ only as part of take-over)
Delay execution of ABORTREQ signal 2 seconds to generate time-out.

5031: lqhKeyRef, ZNO_TC_CONNECT_ERROR
5032: lqhKeyRef, ZTEMPORARY_REDO_LOG_FAILURE
5033: lqhKeyRef, ZTAIL_PROBLEM_IN_LOG_ERROR

5034: Don't pop scan queue

5035: Delay ACC_CONTOPCONT

5038: Drop LQHKEYREQ + set 5039
5039: Drop ABORT + set 5003

8048: Make TC not choose own node for simple/dirty read
5041: Crash if receiving simple read from other TC on different node

8050: Send TCKEYREF if operation is non-local

5100,5101: Drop ABORT req in primary replica
           Crash on "next" ABORT

ERROR CODES FOR TESTING TIME-OUT HANDLING IN DBTC
-------------------------------------------------
8040:
Delay execution of ABORTED signal 2 seconds to generate time-out.

8041:
Delay execution of COMMITTED signal 2 seconds to generate time-out.

8042 (use 8046):
Delay execution of COMMITTED signal 2 seconds to generate COMMITCONF.
Delay execution of COMMITCONF signal 2 seconds to generate time-out.

8043:
Delay execution of COMPLETED signal 2 seconds to generate time-out.

8044 (use 8047):
Delay execution of COMPLETED signal 2 seconds to generate COMPLETECONF.
Delay execution of COMPLETECONF signal 2 seconds to generate time-out.

8045: (ABORTCONF only as part of take-over)
Delay execution of ABORTCONF signal 2 seconds to generate time-out.

8050: Send ZABORT_TIMEOUT_BREAK delayed

ERROR CODES FOR TESTING TIME-OUT HANDLING IN DBTC
-------------------------------------------------

8003: Throw away a LQHKEYCONF in state STARTED
8004: Throw away a LQHKEYCONF in state RECEIVING
8005: Throw away a LQHKEYCONF in state REC_COMMITTING
8006: Throw away a LQHKEYCONF in state START_COMMITTING

8007: Ignore send of LQHKEYREQ in state STARTED
8008: Ignore send of LQHKEYREQ in state START_COMMITTING

8009: Ignore send of LQHKEYREQ+ATTRINFO in state STARTED
8010: Ignore send of LQHKEYREQ+ATTRINFO in state START_COMMITTING

8011: Abort at send of CONTINUEB(ZSEND_ATTRINFO) in state STARTED
8012: Abort at send of CONTINUEB(ZSEND_ATTRINFO) in state START_COMMITTING

8013: Ignore send of CONTINUEB(ZSEND_COMPLETE_LOOP) (should crash eventually)
8014: Ignore send of CONTINUEB(ZSEND_COMMIT_LOOP) (should crash eventually)

8015: Ignore ATTRINFO signal in DBTC in state REC_COMMITTING
8016: Ignore ATTRINFO signal in DBTC in state RECEIVING

8017: Return immediately from DIVERIFYCONF (should crash eventually)
8018: Throw away a COMMITTED signal
8019: Throw away a COMPLETED signal

TESTING TAKE-OVER FUNCTIONALITY IN DBTC
---------------------------------------

8002: Crash when sending LQHKEYREQ
8029: Crash when receiving LQHKEYCONF
8030: Crash when receiving COMMITTED
8031: Crash when receiving COMPLETED
8020: Crash when all COMMITTED has arrived
8021: Crash when all COMPLETED has arrived
8022: Crash when all LQHKEYCONF has arrived

COMBINATION OF TIME-OUT + CRASH
-------------------------------

8023 (use 8024): Ignore LQHKEYCONF and crash when ABORTED signal arrives by setting 8024
8025 (use 8026): Ignore COMMITTED and crash when COMMITCONF signal arrives by setting 8026
8027 (use 8028): Ignore COMPLETED and crash when COMPLETECONF signal arrives by setting 8028
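
Several entries in this file arm a second code once the first one fires
("8023 ... by setting 8024", "5038 ... + set 5039"). The sketch below follows
such chains. It reads every "(use X)" note as "the first code sets X as a
follow-up", which is an interpretation of the notes rather than something
checked against the NDB source. The names FOLLOW_UP and chain are hypothetical.

```python
# Follow-up codes as read from the "(use X)" and "+ set X" notes in this file.
FOLLOW_UP = {
    5012: 5017, 5014: 5018,
    5038: 5039, 5039: 5003,
    8023: 8024, 8025: 8026, 8027: 8028,
    8042: 8046, 8044: 8047,
}


def chain(code: int) -> list[int]:
    """Return the code followed by every follow-up code it leads to."""
    seen = [code]
    while seen[-1] in FOLLOW_UP:
        nxt = FOLLOW_UP[seen[-1]]
        if nxt in seen:  # guard against an accidental cycle in the table
            break
        seen.append(nxt)
    return seen


if __name__ == "__main__":
    print(chain(5038))
    print(chain(8023))
```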

ABORT OF TCKEYREQ
-----------------

8032: No free TC records any more

8037 : Invalid schema version in TCINDXREQ

------

8038 : Simulate API disconnect just after SCAN_TAB_REQ


CMVMI
-----
9000 Set RestartOnErrorInsert to restart -n
9998 Enter endless loop (trigger watchdog)
9999 Crash system immediately

Test Crashes in handling node restarts
--------------------------------------

7121: Crash after receiving permission to start (START_PERMCONF) in starting
      node.
7122: Crash master when receiving request for permission to start (START_PERMREQ).
7123: Crash any non-starting node when receiving information about a starting node
      (START_INFOREQ)
7124: Respond negatively on an info request (START_INFOREQ)
7125: Stop an invalidate Node LCP process in the middle to test whether START_INFOREQ
      requests stopped by long-running processes are handled correctly.
7126: Allow node restarts for all nodes (used in conjunction with 7025)
7127: Crash when receiving an INCL_NODEREQ message.
7128: Crash master after receiving all INCL_NODECONF from all nodes
7129: Crash master after receiving all INCL_NODECONF from all nodes and releasing
      the lock on the dictionary
7130: Crash starting node after receiving START_MECONF
7131: Crash when receiving START_COPYREQ in master node
7132: Crash when receiving START_COPYCONF in starting node

7170: Crash when receiving START_PERMREF (InitialStartRequired)

7174: Crash starting node before sending DICT_LOCK_REQ
7175: Master sends one fake START_PERMREF (ZNODE_ALREADY_STARTING_ERROR)
7176: Slave NR pretends master does not support DICT lock (rolling upgrade)

DICT:
6000  Crash during NR when receiving DICTSTARTREQ
6001  Crash during NR when receiving SCHEMA_INFO
6002  Crash during NR soon after sending GET_TABINFO_REQ

LQH:
5026  Crash when receiving COPY_ACTIVEREQ
5027  Crash when receiving STAT_RECREQ

5043  Crash starting node, when scan is finished on primary replica

Test Crashes in handling take over
----------------------------------

7133: Crash when receiving START_TOREQ
7134: Crash master after receiving all START_TOCONF
7135: Crash master after copying table 0 to starting node
7136: Crash master after completing copy of tables
7137: Crash master after adding a fragment before copying it
7138: Crash when receiving CREATE_FRAGREQ in prepare phase
7139: Crash when receiving CREATE_FRAGREQ in commit phase
7140: Crash master when receiving all CREATE_FRAGCONF in prepare phase
7141: Crash master when receiving all CREATE_FRAGCONF in commit phase
7142: Crash master when receiving COPY_FRAGCONF
7143: Crash master when receiving COPY_ACTIVECONF
7144: Crash when receiving END_TOREQ
7145: Crash master after receiving first END_TOCONF
7146: Crash master after receiving all END_TOCONF
7147: Crash master after receiving first START_TOCONF
7148: Crash master after receiving first CREATE_FRAGCONF
7152: Crash master after receiving first UPDATE_TOCONF
7153: Crash master after receiving all UPDATE_TOCONF
7154: Crash when receiving UPDATE_TOREQ
7155: Crash master when completing writing start take over info
7156: Crash master when completing writing end take over info

Test failures in various states in take over functionality
----------------------------------------------------------
7157: Block take over at start take over
7158: Block take over at sending of START_TOREQ
7159: Block take over at selecting next fragment
7160: Block take over at creating new fragment
7161: Block take over at sending of CREATE_FRAGREQ in prepare phase
7162: Block take over at sending of CREATE_FRAGREQ in commit phase
7163: Block take over at sending of UPDATE_TOREQ at end of copy frag
7164: Block take over at sending of END_TOREQ
7169: Block take over at sending of UPDATE_TOREQ at end of copy

5008: Crash at reception of EMPTY_LCPREQ (at master take over after NF)
5009: Crash at sending of EMPTY_LCPCONF (at master take over after NF)

Test Crashes in Handling Graceful Shutdown
------------------------------------------
7065: Crash when receiving STOP_PERMREQ in master
7066: Crash when receiving STOP_PERMREQ in slave
7067: Crash when receiving DIH_SWITCH_REPLICA_REQ
7068: Crash when receiving DIH_SWITCH_REPLICA_CONF


Backup Stuff:
------------------------------------------
10001: Crash on NODE_FAILREP in Backup coordinator
10002: Crash on NODE_FAILREP when coordinatorTakeOver
10003: Crash on PREP_CREATE_TRIG_{CONF/REF} (only coordinator)
10004: Crash on START_BACKUP_{CONF/REF} (only coordinator)
10005: Crash on CREATE_TRIG_{CONF/REF} (only coordinator)
10006: Crash on WAIT_GCP_REF (only coordinator)
10007: Crash on WAIT_GCP_CONF (only coordinator)
10008: Crash on WAIT_GCP_CONF during start of backup (only coordinator)
10009: Crash on WAIT_GCP_CONF during stop of backup (only coordinator)
10010: Crash on BACKUP_FRAGMENT_CONF (only coordinator)
10011: Crash on BACKUP_FRAGMENT_REF (only coordinator)
10012: Crash on DROP_TRIG_{CONF/REF} (only coordinator)
10013: Crash on STOP_BACKUP_{CONF/REF} (only coordinator)
10014: Crash on DEFINE_BACKUP_REQ (participant)
10015: Crash on START_BACKUP_REQ (participant)
10016: Crash on BACKUP_FRAGMENT_REQ (participant)
10017: Crash on SCAN_FRAGCONF (participant)
10018: Crash on FSAPPENDCONF (participant)
10019: Crash on TRIG_ATTRINFO (participant)
10020: Crash on STOP_BACKUP_REQ (participant)
10021: Crash on NODE_FAILREP in participant not becoming coordinator

10022: Fake no backup records at DEFINE_BACKUP_REQ (participant)
10023: Abort backup by error at reception of UTIL_SEQUENCE_CONF (code 300)
10024: Abort backup by error at reception of DEFINE_BACKUP_CONF (code 301)
10025: Abort backup by error at reception of CREATE_TRIG_CONF last (code 302)
10026: Abort backup by error at reception of START_BACKUP_CONF (code 303)
10027: Abort backup by error at reception of DEFINE_BACKUP_REQ at master (code 304)
10028: Abort backup by error at reception of BACKUP_FRAGMENT_CONF at master (code 305)
10029: Abort backup by error at reception of FSAPPENDCONF in slave (FileOrScanError = 5)
10030: Simulate buffer full from trigger execution => abort backup
10031: Error 331 for dictCommitTableMutex_locked
10032: backup checkscan
10033: backup checkscan
10034: define backup reply error
10035: Fail to allocate buffers

10036: Halt backup for table >= 2
10037: Resume backup (from 10036)
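
Codes 10023-10028 above each carry a "(code NNN)" note. Reading that note as
the error code the backup is expected to abort with (an interpretation, not
verified against the BACKUP block), a test helper could look up the expected
result like this. The names are hypothetical.

```python
# Injection code -> annotated "(code NNN)" value, copied from the list above.
ABORT_RESULT = {
    10023: 300,  # UTIL_SEQUENCE_CONF
    10024: 301,  # DEFINE_BACKUP_CONF
    10025: 302,  # CREATE_TRIG_CONF last
    10026: 303,  # START_BACKUP_CONF
    10027: 304,  # DEFINE_BACKUP_REQ at master
    10028: 305,  # BACKUP_FRAGMENT_CONF at master
}


def expected_backup_error(injected_code: int):
    """Return the annotated abort code, or None if the list gives none."""
    return ABORT_RESULT.get(injected_code)
```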

11001: Send UTIL_SEQUENCE_REF (in master)

5028:  Crash when receiving LQHKEYREQ (in non-master)

Failed Create Table:
--------------------
7173: Create table failed due to an insufficient number of fragment or
      replica records.
3001: Fail create 1st fragment
4007 12001: Fail create 1st fragment
4008 12002: Fail create 2nd fragment
4009 12003: Fail create 1st attribute in 1st fragment
4010 12004: Fail create last attribute in 1st fragment
4011 12005: Fail create 1st attribute in 2nd fragment
4012 12006: Fail create last attribute in 2nd fragment

Drop Table/Index:
-----------------
4001: Crash on REL_TABMEMREQ in TUP
4002: Crash on DROP_TABFILEREQ in TUP
4003: Fail next trigger create in TUP
4004: Fail next trigger drop in TUP
8033: Fail next trigger create in TC
8034: Fail next index create in TC
8035: Fail next trigger drop in TC
8036: Fail next index drop in TC
6006: Crash participant in create index

4013: verify TUP tab descr before and after next DROP TABLE

System Restart:
---------------

5020: Force system to read pages from file when executing prepare operation record
3000: Delay writing of datapages in ACC when LCP is started
4000: Delay writing of datapages in TUP when LCP is started
7070: Set TimeBetweenLcp to min value
7071: Set TimeBetweenLcp to max value
7072: Split START_FRAGREQ into several log nodes
7073: Don't include own node in START_FRAGREQ
7074: 7072 + 7073

Scan:
------

5021: Crash when receiving SCAN_NEXTREQ if sender is own node
5022: Crash when receiving SCAN_NEXTREQ if sender is NOT own node
5023: Drop SCAN_NEXTREQ if sender is own node
5024: Drop SCAN_NEXTREQ if sender is NOT own node
5025: Delay SCAN_NEXTREQ 1 second if sender is NOT own node
5030: Drop all SCAN_NEXTREQ until node is shut down with SYSTEM_ERROR
      because of scan fragment timeout

Test routing of signals:
-----------------------
4006: Turn on routing of TRANSID_AI signals from TUP
5029: Turn on routing of KEYINFO20 signals from LQH

Ordered index:
--------------
12007: Make next alloc node fail with no memory error

Dbdict:
-------
6003 Crash in participant @ CreateTabReq::Prepare
6004 Crash in participant @ CreateTabReq::Commit
6005 Crash in participant @ CreateTabReq::CreateDrop

Dbtup:
4014 - handleInsert - Out of undo buffer
4015 - handleInsert - Out of log space
4016 - handleInsert - AI Inconsistency
4017 - handleInsert - Out of memory
4018 - handleInsert - Null check error
4019 - handleInsert - Alloc rowid error
4020 - handleInsert - Size change error
4021 - handleInsert - Out of disk space

4022 - addTuxEntries - fail before add of first entry
4023 - addTuxEntries - fail add of last entry (the entry for last index)

4025: Fail all inserts with out of memory
4026: Fail one insert with oom
4027: Fail inserts randomly with oom
4028: Fail one random insert with oom

NDBCNTR:

1000: Crash insertion on SystemError::CopyFragRef