1. 17 Oct, 2023 11 commits
    • Merge patch series "scsi: scsi_debug: Add error injection for single device" · 058676b5
      Martin K. Petersen authored
      Wenchao Hao <haowenchao2@huawei.com> says:
      
      The original error injection mechanism was based on scsi_host and
      could not inject faults for a single SCSI device.
      
      This patchset provides the ability to inject errors for a single SCSI
      device. It supports injecting timeout errors, queuecommand errors, and
      hostbyte, driverbyte, statusbyte, and sense data for a specific SCSI
      command. Two new error injection types are defined to make an abort
      command or a LUN reset fail.
      
      Besides error injection for a single device, this patchset adds a new
      interface to make target reset fail for each scsi_target.
      
      The first two patches add a debugfs interface to add and query a single
      device's error injection info; the third patch defines how to remove
      an injection which has been added. The following 5 patches use the
      injection info and generate the related error type. The last two
      add a new interface to make target reset fail and a parameter to
      control scsi_device's allow_restart flag.
      
      Link: https://lore.kernel.org/r/20231010092051.608007-1-haowenchao2@huawei.com
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: scsi_debug: Add param to control sdev's allow_restart · 573c2d06
      Wenchao Hao authored
      Add new module param "allow_restart" to control scsi_device's allow_restart
      flag. This flag determines if EH is triggered after a command completes
      with sense_key 0x6, ASC 0x4 and ASCQ 0x2. EH would be triggered if
      allow_restart=1 in this condition.
      
      The new param can be used with the error injection capability to test how
      commands completing with sense_key 0x6, ASC 0x4 and ASCQ 0x2 are handled.
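
      A minimal sketch of combining the two, assuming the type-2 (fail
      command) rule format introduced elsewhere in this series. The helper
      name restart_rule and the opcode used in the usage comment are
      illustrative assumptions, not part of the driver.

```shell
# Build a type-2 (fail command) rule that completes the given opcode with
# CHECK CONDITION (status byte 0x2) and sense_key 0x6, ASC 0x4, ASCQ 0x2.
# restart_rule is a hypothetical helper name used only for this sketch.
restart_rule() {
    # $1: error count, $2: SCSI command opcode
    printf '2 %s %s 0 0 0x2 0x6 0x4 0x2\n' "$1" "$2"
}

# Intended use on a test machine (needs root and the scsi_debug module):
#   modprobe scsi_debug allow_restart=1
#   restart_rule -1 0x88 > /sys/kernel/debug/scsi_debug/<h:c:t:l>/error
```

      The function only prints the rule, so it can be inspected before it is
      redirected into a device's error file.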

      Signed-off-by: Wenchao Hao <haowenchao2@huawei.com>
      Link: https://lore.kernel.org/r/20231010092051.608007-11-haowenchao2@huawei.com
      Tested-by: Douglas Gilbert <dgilbert@interlog.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: scsi_debug: Add debugfs interface to fail target reset · f084fe52
      Wenchao Hao authored
      The interface is found at
      /sys/kernel/debug/scsi_debug/target<h:c:t>/fail_reset where <h:c:t>
      identifies the target to inject errors on. It's a simple bool type
      interface which would make this target's reset fail if set to 'Y'.
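
      A small sketch of toggling the flag. The helper name set_fail_reset is
      hypothetical, and writing 'N' to clear the flag is an assumption based
      on the interface being a plain debugfs bool.

```shell
# Write 'Y' or 'N' to a target's fail_reset file.
# set_fail_reset is a hypothetical helper for this sketch.
#   $1: path to the fail_reset file
#   $2: Y (make target reset fail) or N (assumed to restore normal behaviour)
set_fail_reset() {
    case "$2" in
        Y|N) printf '%s\n' "$2" > "$1" ;;
        *)   echo "usage: set_fail_reset <file> Y|N" >&2; return 1 ;;
    esac
}

# Intended use (needs root):
#   set_fail_reset /sys/kernel/debug/scsi_debug/target<h:c:t>/fail_reset Y
```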

      Signed-off-by: Wenchao Hao <haowenchao2@huawei.com>
      Link: https://lore.kernel.org/r/20231010092051.608007-10-haowenchao2@huawei.com
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: scsi_debug: Add new error injection type: Reset LUN failed · 02678116
      Wenchao Hao authored
      Add error injection type 4 to make scsi_debug_device_reset() return FAILED.
      Fail reset LUN command format:
      
        +--------+------+-------------------------------------------------------+
        | Column | Type | Description                                           |
        +--------+------+-------------------------------------------------------+
        |   1    |  u8  | Error type, fixed to 0x4                              |
        +--------+------+-------------------------------------------------------+
        |   2    |  s32 | Error count                                           |
        |        |      |  0: this rule will be ignored                         |
        |        |      |  positive: the rule will always take effect           |
        |        |      |  negative: the rule takes effect n times where -n is  |
        |        |      |            the value given. Ignored after n times     |
        +--------+------+-------------------------------------------------------+
        |   3    |  x8  | SCSI command opcode, 0xff for all commands            |
        +--------+------+-------------------------------------------------------+
      
      Examples:
      
          error=/sys/kernel/debug/scsi_debug/0:0:0:1/error
          echo "4 -10 0x12" > ${error}
      
      will make the device return FAILED the first 10 times a LUN reset is
      attempted for an INQUIRY command.
      
          error=/sys/kernel/debug/scsi_debug/0:0:0:1/error
          echo "4 -10 0xff" > ${error}
      
      will make the device return FAILED when trying to reset LUN 10 times.
      
      Usually it does not matter which command triggered the LUN reset, so
      0xff can be used.

      Signed-off-by: Wenchao Hao <haowenchao2@huawei.com>
      Link: https://lore.kernel.org/r/20231010092051.608007-9-haowenchao2@huawei.com
      Tested-by: Douglas Gilbert <dgilbert@interlog.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: scsi_debug: Add new error injection type: Abort Failed · 5551ce92
      Wenchao Hao authored
      Add error injection type 3 to make scsi_debug_abort() return FAILED.  Fail
      abort command format:
      
        +--------+------+-------------------------------------------------------+
        | Column | Type | Description                                           |
        +--------+------+-------------------------------------------------------+
        |   1    |  u8  | Error type, fixed to 0x3                              |
        +--------+------+-------------------------------------------------------+
        |   2    |  s32 | Error count                                           |
        |        |      |  0: this rule will be ignored                         |
        |        |      |  positive: the rule will always take effect           |
        |        |      |  negative: the rule takes effect n times where -n is  |
        |        |      |            the value given. Ignored after n times     |
        +--------+------+-------------------------------------------------------+
        |   3    |  x8  | SCSI command opcode, 0xff for all commands            |
        +--------+------+-------------------------------------------------------+
      
      Examples:
      
          error=/sys/kernel/debug/scsi_debug/0:0:0:1/error
          echo "3 -10 0x12" > ${error}
      
      will make the device return FAILED when aborting inquiry command 10 times.

      Signed-off-by: Wenchao Hao <haowenchao2@huawei.com>
      Link: https://lore.kernel.org/r/20231010092051.608007-8-haowenchao2@huawei.com
      Tested-by: Douglas Gilbert <dgilbert@interlog.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: scsi_debug: Set command result and sense data if error is injected · 33592274
      Wenchao Hao authored
      If a fail command error is injected, set the command's status and sense
      data then finish this SCSI command.
      
      Set SCSI command's status and sense data format:
      
        +--------+------+-------------------------------------------------------+
        | Column | Type | Description                                           |
        +--------+------+-------------------------------------------------------+
        |   1    |  u8  | Error type, fixed to 0x2                              |
        +--------+------+-------------------------------------------------------+
        |   2    |  s32 | Error Count                                           |
        |        |      |  0: the rule will be ignored                          |
        |        |      |  positive: the rule will always take effect           |
        |        |      |  negative: the rule takes effect n times where -n is  |
        |        |      |            the value given. Ignored after n times     |
        +--------+------+-------------------------------------------------------+
        |   3    |  x8  | SCSI command opcode, 0xff for all commands            |
        +--------+------+-------------------------------------------------------+
        |   4    |  x8  | Host byte in scsi_cmd::status                         |
        |        |      | [scsi_cmd::status has 32 bits holding these 3 bytes]  |
        +--------+------+-------------------------------------------------------+
        |   5    |  x8  | Driver byte in scsi_cmd::status                       |
        +--------+------+-------------------------------------------------------+
        |   6    |  x8  | SCSI Status byte in scsi_cmd::status                  |
        +--------+------+-------------------------------------------------------+
        |   7    |  x8  | SCSI Sense Key in scsi_cmnd                           |
        +--------+------+-------------------------------------------------------+
        |   8    |  x8  | SCSI ASC in scsi_cmnd                                 |
        +--------+------+-------------------------------------------------------+
        |   9    |  x8  | SCSI ASCQ in scsi_cmnd                                |
        +--------+------+-------------------------------------------------------+
      
      Examples:
          error=/sys/kernel/debug/scsi_debug/0:0:0:1/error
          echo "2 -10 0x88 0 0 0x2 0x3 0x11 0x0" >${error}
      
      will make the device's READ(16) command (opcode 0x88) complete 10 times
      with CHECK CONDITION status, sense key MEDIUM ERROR and additional sense
      "Unrecovered read error" (UNC).

      Acked-by: Douglas Gilbert <dgilbert@interlog.com>
      Signed-off-by: Wenchao Hao <haowenchao2@huawei.com>
      Link: https://lore.kernel.org/r/20231010092051.608007-7-haowenchao2@huawei.com
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: scsi_debug: Return failed value if error is injected · 33bccf55
      Wenchao Hao authored
      If a fail queuecommand error is injected, return the failed value defined
      in the rule from queuecommand.
      
      Make queuecommand return format:
      
        +--------+------+-------------------------------------------------------+
        | Column | Type | Description                                           |
        +--------+------+-------------------------------------------------------+
        |   1    |  u8  | Error type, fixed to 0x1                              |
        +--------+------+-------------------------------------------------------+
        |   2    |  s32 | Error count                                           |
        |        |      |  0: this rule will be ignored                         |
        |        |      |  positive: the rule will always take effect           |
        |        |      |  negative: the rule takes effect n times where -n is  |
        |        |      |            the value given. Ignored after n times     |
        +--------+------+-------------------------------------------------------+
        |   3    |  x8  | SCSI command opcode, 0xff for all commands            |
        +--------+------+-------------------------------------------------------+
        |   4    |  x32 | The queuecommand() return value we want               |
        +--------+------+-------------------------------------------------------+
      
      Examples:
      
          error=/sys/kernel/debug/scsi_debug/0:0:0:1/error
          echo "1 1 0x12 0x1055" > ${error}
      
      will make each INQUIRY command sent to that device return 0x1055
      (SCSI_MLQUEUE_HOST_BUSY).
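
      As a sketch, a type-1 rule can be built from a symbolic return value
      instead of a raw number. Only 0x1055 is confirmed by the example above;
      the other three values are quoted from include/scsi/scsi.h as an
      assumption and should be checked against the kernel tree in use. The
      helper name queuecommand_rule is hypothetical.

```shell
# Build a type-1 (fail queuecommand) rule from a symbolic return value.
# queuecommand_rule is a hypothetical helper for this sketch.
#   $1: error count, $2: SCSI command opcode, $3: SCSI_MLQUEUE_* name
queuecommand_rule() {
    case "$3" in
        SCSI_MLQUEUE_HOST_BUSY)   ret=0x1055 ;;  # value given in the example
        SCSI_MLQUEUE_DEVICE_BUSY) ret=0x1056 ;;  # assumed from scsi.h
        SCSI_MLQUEUE_EH_RETRY)    ret=0x1057 ;;  # assumed from scsi.h
        SCSI_MLQUEUE_TARGET_BUSY) ret=0x1058 ;;  # assumed from scsi.h
        *) echo "unknown return value: $3" >&2; return 1 ;;
    esac
    printf '1 %s %s %s\n' "$1" "$2" "$ret"
}

# Intended use (needs root):
#   queuecommand_rule 1 0x12 SCSI_MLQUEUE_HOST_BUSY > ${error}
```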

      Acked-by: Douglas Gilbert <dgilbert@interlog.com>
      Signed-off-by: Wenchao Hao <haowenchao2@huawei.com>
      Link: https://lore.kernel.org/r/20231010092051.608007-6-haowenchao2@huawei.com
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: scsi_debug: Time out command if the error is injected · 32be8b6e
      Wenchao Hao authored
      If a timeout error is injected, return 0 from scsi_debug_queuecommand to
      make the command time out.
      
      Time out SCSI command format:
      
        +--------+------+-------------------------------------------------------+
        | Column | Type | Description                                           |
        +--------+------+-------------------------------------------------------+
        |   1    |  u8  | Error type, fixed to 0x0                              |
        +--------+------+-------------------------------------------------------+
        |   2    |  s32 | Error count                                           |
        |        |      |  0: this rule will be ignored                         |
        |        |      |  positive: the rule will always take effect           |
        |        |      |  negative: the rule takes effect n times where -n is  |
        |        |      |            the value given. Ignored after n times     |
        +--------+------+-------------------------------------------------------+
        |   3    |  x8  | SCSI command opcode, 0xff for all commands            |
        +--------+------+-------------------------------------------------------+
      
      Examples:
      
          error=/sys/kernel/debug/scsi_debug/0:0:0:1/error
          echo "0 -10 0x12" > ${error}
      
      will make the device's inquiry command time out 10 times.
      
          echo "0 1 0x12" > ${error}
      
      will make the device's inquiry time out each time it is invoked on this
      device.

      Acked-by: Douglas Gilbert <dgilbert@interlog.com>
      Signed-off-by: Wenchao Hao <haowenchao2@huawei.com>
      Link: https://lore.kernel.org/r/20231010092051.608007-5-haowenchao2@huawei.com
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: scsi_debug: Define grammar to remove added error injection · 962d77cd
      Wenchao Hao authored
      An error injection rule is removed by writing a line with exactly 3
      columns separated by spaces.
      
      The first column is fixed to "-" and marks the line as a removal
      operation. The second column is the error code to match. The third
      column is the SCSI command to match.
      
      For example the following command would remove timeout injection of inquiry
      command:
      
          echo "- 0 0x12" > /sys/kernel/debug/scsi_debug/0:0:0:1/error
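
      Since a removal line reuses the error code and opcode of the rule it
      removes, it can be derived from the rule that was added. A minimal
      sketch, with removal_for as a hypothetical helper name:

```shell
# Derive the removal line for a rule: "-", then the rule's error code
# (column 1) and SCSI command opcode (column 3).
# removal_for is a hypothetical helper for this sketch.
removal_for() {
    # $1: the rule line as originally written, e.g. "0 -10 0x12"
    set -- $1              # split the rule on whitespace into $1, $2, $3...
    printf '%s %s %s\n' - "$1" "$3"
}

# Intended use (needs root):
#   rule="0 -10 0x12"
#   echo "$rule" > ${error}                # add the rule
#   removal_for "$rule" > ${error}         # remove it again
```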

      Acked-by: Douglas Gilbert <dgilbert@interlog.com>
      Signed-off-by: Wenchao Hao <haowenchao2@huawei.com>
      Link: https://lore.kernel.org/r/20231010092051.608007-4-haowenchao2@huawei.com
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: scsi_debug: Add interface to manage error injection for a single device · a9996d72
      Wenchao Hao authored
      This new facility uses the debugfs pseudo file system which is typically
      mounted under the /sys/kernel/debug directory and requires root permissions
      to access.
      
      The interface file is found at /sys/kernel/debug/scsi_debug/<h:c:t:l>/error
      where <h:c:t:l> identifies the device (logical unit (LU)) to inject errors
      on.
      
      For the following description the ${error} environment variable is assumed
      to be set to /sys/kernel/debug/scsi_debug/1:0:0:0/error where 1:0:0:0 is a
      pseudo device (LU) owned by the scsi_debug driver. Rules are written to
      ${error} in the normal sysfs fashion (e.g. 'echo "0 -2 0x12" > ${error}').
      
      More than one rule can be active on a device at a time, and inactive rules
      (i.e. those whose error count is 0) remain in the rule listing. The
      existing rules can be read with 'cat ${error}', which prints one line of
      output for each rule.
      
      The interface is line-oriented: each line is one error injection rule.
      Each rule consists of integers separated by spaces. The first three
      columns are "Error code", "Error count" and "SCSI command"; the
      remaining columns depend on the Error code.
      
      General rule format:
        +--------+------+-------------------------------------------------------+
        | Column | Type | Description                                           |
        +--------+------+-------------------------------------------------------+
        |   1    |  u8  | Error code                                            |
        |        |      |  0: timeout SCSI command                              |
        |        |      |  1: fail queuecommand, make queuecommand return       |
        |        |      |     given value                                       |
        |        |      |  2: fail command, finish command with SCSI status,    |
        |        |      |     sense key and ASC/ASCQ values                     |
        |        |      |  3: make abort commands for specific command fail     |
        |        |      |  4: make reset lun for specific command fail          |
        +--------+------+-------------------------------------------------------+
        |   2    |  s32 | Error count                                           |
        |        |      |  0: this rule will be ignored                         |
        |        |      |  positive: the rule will always take effect           |
        |        |      |  negative: the rule takes effect n times where -n is  |
        |        |      |            the value given. Ignored after n times     |
        +--------+------+-------------------------------------------------------+
        |   3    |  x8  | SCSI command opcode, 0xff for all commands            |
        +--------+------+-------------------------------------------------------+
        |  ...   |  xxx | Error type specific fields                            |
        +--------+------+-------------------------------------------------------+
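
      The general format above can be illustrated by splitting a rule line
      into its three fixed columns and the type-specific remainder. This is
      a sketch only; parse_rule is a hypothetical helper and does not
      validate the values.

```shell
# Split one rule line into its three fixed columns plus the remainder.
# parse_rule is a hypothetical helper for this sketch.
parse_rule() {
    # $1: a rule line such as "1 1 0x12 0x1055"
    printf '%s\n' "$1" | {
        # read assigns any columns beyond the third to "extra"
        read -r code count opcode extra
        printf 'code=%s count=%s opcode=%s extra=%s\n' \
            "$code" "$count" "$opcode" "$extra"
    }
}
```

      For example, parse_rule "1 1 0x12 0x1055" reports the queuecommand
      return value in the extra field, while a three-column timeout rule
      leaves it empty.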
      
      Notes:
      
       - When multiple error inject rules are added for the same SCSI command,
         the one with smaller error code will take effect (and the others will be
         ignored).
      
       - If the same error (i.e. same Error code and SCSI command) is added
         again, the older rule will be overwritten.
      
       - Currently, the supported types are the basic types
         (u8/u16/u32/u64/s8/s16/s32/s64) and the hexadecimal types
         (x8/x16/x32/x64).
      
       - Where a hexadecimal value is expected (e.g. Column 3: SCSI command
         opcode) the "0x" prefix is optional on the value (e.g. the INQUIRY
         opcode can be given as '0x12' or '12').
      
       - When the Error count is negative, reading ${error} will show that value
         incrementing, stopping when it gets to 0.
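
      The Error count behaviour described above can be modelled in a few
      lines. This is a sketch of the documented semantics only, not the
      driver's code; next_count and count_trace are hypothetical names.

```shell
# Model how an Error count changes each time its rule takes effect.
next_count() {
    if [ "$1" -lt 0 ]; then
        echo $(( $1 + 1 ))   # negative: counts up towards 0
    else
        echo "$1"            # positive: unchanged; 0: rule stays ignored
    fi
}

# Print the counts a rule would show after each firing, one per line,
# until the rule becomes inactive (count 0). Prints nothing for counts
# that are already 0 or positive.
count_trace() {
    c=$1
    while [ "$c" -lt 0 ]; do
        c=$(next_count "$c")
        echo "$c"
    done
}
```

      Under this model a rule written with count -3 would read back as -2,
      -1 and then 0 after its three firings.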

      Acked-by: Douglas Gilbert <dgilbert@interlog.com>
      Signed-off-by: Wenchao Hao <haowenchao2@huawei.com>
      Link: https://lore.kernel.org/r/20231010092051.608007-3-haowenchao2@huawei.com
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: scsi_debug: Create scsi_debug directory in the debugfs filesystem · 6e2d15f5
      Wenchao Hao authored
      Create the directory scsi_debug in the root of the debugfs filesystem.
      This prepares for adding an interface to manage error injection.

      Acked-by: Douglas Gilbert <dgilbert@interlog.com>
      Signed-off-by: Wenchao Hao <haowenchao2@huawei.com>
      Link: https://lore.kernel.org/r/20231010092051.608007-2-haowenchao2@huawei.com
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  2. 13 Oct, 2023 29 commits