1. 29 Sep, 2023 13 commits
    • Merge tag 'ceph-for-6.6-rc4' of https://github.com/ceph/ceph-client · 14c06b91
      Linus Torvalds authored
      Pull ceph fixes from Ilya Dryomov:
       "A series that fixes an involved 'double watch error' deadlock in RBD
        marked for stable and two cleanups"
      
      * tag 'ceph-for-6.6-rc4' of https://github.com/ceph/ceph-client:
        rbd: take header_rwsem in rbd_dev_refresh() only when updating
        rbd: decouple parent info read-in from updating rbd_dev
        rbd: decouple header read-in from updating rbd_dev->header
        rbd: move rbd_dev_refresh() definition
        Revert "ceph: make members in struct ceph_mds_request_args_ext a union"
        ceph: remove unnecessary check for NULL in parse_longname()
    • Merge tag 'xfs-6.6-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · 10c0b6ba
      Linus Torvalds authored
      Pull xfs fix from Chandan Babu:
      
       - fix for commit 68b957f6 ("xfs: load uncached unlinked inodes into
         memory on demand") which addresses review comments provided by Dave
         Chinner
      
      * tag 'xfs-6.6-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
        xfs: fix reloading entire unlinked bucket lists
    • Merge tag 'ata-6.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata · 95289e49
      Linus Torvalds authored
      Pull ATA fixes from Damien Le Moal:
       "A larger than usual set of fixes for 6.6-rc4 due to the unexpected
        number of fixes needed to address ATA disks suspend/resume issues.
      
        In more detail:
      
         - Add missing additionalProperties on child nodes to the pata-common
           DT bindings (Rob)
      
         - Fix handling of the REPORT SUPPORTED OPERATION CODES command to
           ignore reserved bits (Niklas)
      
         - Increase port multiplier soft reset timeout to accommodate slow
           devices and avoid issues on wakeup (Matthias)
      
         - A couple of minor code fixes to avoid compilation warnings in
           libata-core and libata-eh (me)
      
         - Many patches from me to address suspend/resume issues, and in
           particular a potential deadlock on resume due to the SCSI disk
           driver resume operation not being synchronized with libata EH port
           resume handling.
      
           This is addressed by changing the scsi disk driver disk start/stop
           control to allow libata to execute disk suspend (spin down) and
           resume (spin up) on its own during system suspend/resume. Runtime
           suspend/resume control remains with the SCSI disk driver.
      
           Other fixes include:
            - Fix libata power management request issuing to avoid races
            - Establish a link between ATA ports and SCSI devices to order PM
              operations
            - Fix device removal to avoid issues with driver rmmod removal
            - Fix synchronization of libata device rescan and SCSI disk resume
              operation
            - Remove libsas PM operations as suspend/resume is handled
              directly by the sas controller resume
            - Fix the SCSI disk driver to not issue commands to suspended
              disks, thus avoiding potential system lock-up on resume"
      
      * tag 'ata-6.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata:
        ata: libata-eh: Fix compilation warning in ata_eh_link_report()
        ata: libata-core: Fix compilation warning in ata_dev_config_ncq()
        scsi: sd: Do not issue commands to suspended disks on shutdown
        ata: libata-core: Do not register PM operations for SAS ports
        ata: libata-scsi: Fix delayed scsi_rescan_device() execution
        scsi: Do not attempt to rescan suspended devices
        ata: libata-scsi: Disable scsi device manage_system_start_stop
        scsi: sd: Differentiate system and runtime start/stop management
        ata: libata-scsi: link ata port and scsi device
        ata: libata-core: Fix port and device removal
        ata: libata-core: Fix ata_port_request_pm() locking
        ata: libata-sata: increase PMP SRST timeout to 10s
        ata: libata-scsi: ignore reserved bits for REPORT SUPPORTED OPERATION CODES
        dt-bindings: ata: pata-common: Add missing additionalProperties on child nodes
    • Merge tag 'block-6.6-2023-09-28' of git://git.kernel.dk/linux · eafdc507
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
       "Just two minor comment / documentation fixes for the block side"
      
      * tag 'block-6.6-2023-09-28' of git://git.kernel.dk/linux:
        block: fix kernel-doc for disk_force_media_change()
        block: correct stale comment in rq_qos_wait
    • Merge tag 'io_uring-6.6-2023-09-28' of git://git.kernel.dk/linux · a98b9595
      Linus Torvalds authored
      Pull io_uring fix from Jens Axboe:
       "A single fix going to stable for the IORING_OP_LINKAT flag handling"
      
      * tag 'io_uring-6.6-2023-09-28' of git://git.kernel.dk/linux:
        io_uring/fs: remove sqe->rw_flags checking from LINKAT
    • Merge tag 'slab-fixes-for-6.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab · 1c84724c
      Linus Torvalds authored
      Pull slab fixes from Vlastimil Babka:
      
       - stable fix to prevent list corruption when destroying caches with
         leftover objects (Rafael Aquini)
      
       - fix for a gotcha in kmalloc_size_roundup() when calling it with too
         high size, discovered when recently a networking call site had to be
         fixed for a different issue (David Laight)
      
      * tag 'slab-fixes-for-6.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
        slab: kmalloc_size_roundup() must not return 0 for non-zero size
        mm/slab_common: fix slab_caches list corruption after kmem_cache_destroy()
    • Merge tag 'drm-fixes-2023-09-29' of git://anongit.freedesktop.org/drm/drm · 6edc84bc
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Regular pull, this feels suspiciously light so I expect next week might
        be a bit heavier? Let's see how we go. This is from a code point of
        view ivpu and i915 fixes.
      
        The only other patch is adding Danilo Krummrich to the nouveau
        maintainers, he's agreed to take on more of the role after Ben
        retired.
      
        MAINTAINERS:
         - add Danilo for nouveau
      
        ivpu:
         - Add PCI ids for Arrow Lake
         - Fix memory corruption during IPC
         - Avoid dmesg flooding
         - 40xx: Wait for clock resource
         - 40xx: Fix interrupt usage
         - 40xx: Support caching when loading firmware
      
        i915:
         - Fix a panic regression on gen8_ggtt_insert_entries
         - Fix load issue due to reservation address in ggtt_reserve_guc_top
         - Fix a possible deadlock with guc busyness worker"
      
      * tag 'drm-fixes-2023-09-29' of git://anongit.freedesktop.org/drm/drm:
        accel/ivpu: Use cached buffers for FW loading
        accel/ivpu/40xx: Fix missing VPUIP interrupts
        accel/ivpu/40xx: Disable frequency change interrupt
        accel/ivpu/40xx: Ensure clock resource ownership Ack before Power-Up
        accel/ivpu: Don't flood dmesg with VPU ready message
        accel/ivpu: Do not use wait event interruptible
        MAINTAINERS: update nouveau maintainers
        i915/guc: Get runtime pm in busyness worker only if already active
        drm/i915/gt: Fix reservation address in ggtt_reserve_guc_top
        i915: Limit the length of an sg list to the requested length
        accel/ivpu: Add Arrow Lake pci id
    • Merge tag 'gpio-fixes-for-v6.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux · 71e58659
      Linus Torvalds authored
      Pull gpio fixes from Bartosz Golaszewski:
      
       - fix a potential spinlock deadlock in gpio-timberdale
      
       - mark the gpio-pmic-eic-sprd driver as one that can sleep
      
      * tag 'gpio-fixes-for-v6.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
        gpio: pmic-eic-sprd: Add can_sleep flag for PMIC EIC chip
        gpio: timberdale: Fix potential deadlock on &tgpio->lock
    • Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux · acfdcaee
      Linus Torvalds authored
      Pull clk fixes from Stephen Boyd:
       "A bunch of clk driver fixes for issues found recently:
      
         - Fix the binding for versaclock3 that was introduced this merge
           window so we know what the values are for clk consumers
      
         - Fix a 64-bit division issue in the versaclock3 driver
      
         - Avoid breakage in the versaclock3 driver by rejiggering the enums
           used to layout clks
      
         - Fix the parent name of a clk in the Spreadtrum ums512 clk driver
      
         - Fix a suspend/resume issue in Skyworks Si521xx clk driver where
           regmap restoration fails because writes are wedged
      
         - Return zero from Tegra bpmp recalc_rate() implementation when an
           error occurs so we don't consider an error as a large rate"
      
      * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
        clk: tegra: fix error return case for recalc_rate
        clk: si521xx: Fix regmap write accessor
        clk: si521xx: Use REGCACHE_FLAT instead of NONE
        clk: sprd: Fix thm_parents incorrect configuration
        clk: vc3: Make vc3_clk_mux enum values based on vc3_clk enum values
        clk: vc3: Fix output clock mapping
        clk: vc3: Fix 64 by 64 division
        dt-bindings: clock: versaclock3: Add description for #clock-cells property
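
The Tegra bpmp recalc_rate() item in the pull message above turns on a general C pitfall: a negative error code returned through an unsigned rate type reads as a very large rate. The following is a minimal sketch of that pitfall, not the driver code. Every name in it (`model_query_rate`, `model_recalc_rate_buggy`, `model_recalc_rate_fixed`) is hypothetical, and the 100 MHz value is arbitrary.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a firmware query: returns a rate in Hz,
 * or a negative errno value on failure. */
long model_query_rate(int fail)
{
	return fail ? -EINVAL : 100000000L;
}

/* Buggy pattern: the negative error is converted to unsigned long and
 * turns into an enormous "rate". */
unsigned long model_recalc_rate_buggy(int fail)
{
	return (unsigned long)model_query_rate(fail);
}

/* Fixed pattern: report a rate of 0 when the query fails. */
unsigned long model_recalc_rate_fixed(int fail)
{
	long rate = model_query_rate(fail);

	return rate < 0 ? 0 : (unsigned long)rate;
}
```

With a failing query, the buggy variant returns a value near the top of the `unsigned long` range, while the fixed variant returns 0.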
    • Merge tag 'for-v6.6-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply · 94b7ed38
      Linus Torvalds authored
      Pull power supply fixes from Sebastian Reichel:
      
       - core: fix use after free during device release
      
       - ab8500: avoid reporting multiple batteries to userspace
      
       - rk817: fix DT node resource leak
      
       - misc. small fixes, mostly for compiler warnings/errors
      
      * tag 'for-v6.6-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply:
        power: supply: rk817: Fix node refcount leak
        power: supply: core: fix use after free in uevent
        power: supply: rt9467: Fix rt9467_run_aicl()
        power: supply: rk817: Add missing module alias
        power: supply: ucs1002: fix error code in ucs1002_get_property()
        power: vexpress: fix -Wvoid-pointer-to-enum-cast warning
        power: reset: use capital "OR" for multiple licenses in SPDX
        pwr-mlxbf: extend Kconfig to include gpio-mlxbf3 dependency
        power: supply: rt5033_charger: recognize EXTCON setting
        power: supply: mt6370: Fix missing error code in mt6370_chg_toggle_cfo()
        power: supply: ab8500: Set typing and props
    • Merge tag 'xtensa-20230928' of https://github.com/jcmvbkbc/linux-xtensa · b02afe1d
      Linus Torvalds authored
      Pull Xtensa fixes from Max Filippov:
      
       - fix build warnings from builds performed with W=1
      
      * tag 'xtensa-20230928' of https://github.com/jcmvbkbc/linux-xtensa:
        xtensa: boot/lib: fix function prototypes
        xtensa: umulsidi3: fix conditional expression
        xtensa: boot: don't add include-dirs
        xtensa: iss/network: make functions static
        xtensa: tlb: include <asm/tlb.h> for missing prototype
        xtensa: hw_breakpoint: include header for missing prototype
        xtensa: smp: add headers for missing function prototypes
        irqchip: irq-xtensa-mx: include header for missing prototype
        xtensa: traps: add <linux/cpu.h> for function prototype
        xtensa: stacktrace: include <asm/ftrace.h> for prototype
        xtensa: signal: include headers for function prototypes
        xtensa: processor.h: add init_arch() prototype
        xtensa: ptrace: add prototypes to <asm/ptrace.h>
        xtensa: irq: include <asm/traps.h>
        xtensa: fault: include <asm/traps.h>
        xtensa: add default definition for XCHAL_HAVE_DIV32
    • io_uring/fs: remove sqe->rw_flags checking from LINKAT · a52d4f65
      Jens Axboe authored
      This is unionized with the actual link flags, so they can of course be
      set and they will be evaluated further down. If not we fail any LINKAT
      that has to set option flags.
      
      Fixes: cf30da90 ("io_uring: add support for IORING_OP_LINKAT")
      Cc: stable@vger.kernel.org
      Reported-by: Thomas Leonard <talex5@gmail.com>
      Link: https://github.com/axboe/liburing/issues/955
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
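
The LINKAT commit message above describes a field that shares storage with the link flags. The sketch below illustrates why a "must be zero" check on one union member rejects every request that sets the other. It is a simplified model, not the real `io_uring_sqe` layout or the kernel prep function. The struct and function names are hypothetical, and any flag value passed in is only a stand-in.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Simplified model: two names for the same 32-bit storage. */
struct model_sqe {
	union {
		uint32_t rw_flags;
		uint32_t hardlink_flags;
	};
};

/* Buggy prep: rejects any request whose rw_flags is non-zero, which
 * also rejects every request that sets hardlink_flags. */
int model_linkat_prep_buggy(const struct model_sqe *sqe)
{
	if (sqe->rw_flags)
		return -EINVAL;
	return 0;
}

/* Fixed prep: no rw_flags check; the value is simply read as the link
 * flags and left to later validation. */
int model_linkat_prep_fixed(const struct model_sqe *sqe, uint32_t *flags_out)
{
	*flags_out = sqe->hardlink_flags;
	return 0;
}
```

Setting `hardlink_flags` to any non-zero value makes the buggy variant return `-EINVAL`, while the fixed variant passes the flags through.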
    • Merge tag 'drm-intel-fixes-2023-09-28' of... · 06365a04
      Dave Airlie authored
      Merge tag 'drm-intel-fixes-2023-09-28' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
      
      - Fix a panic regression on gen8_ggtt_insert_entries (Matthew Wilcox)
      - Fix load issue due to reservation address in ggtt_reserve_guc_top (Javier Pello)
      - Fix a possible deadlock with guc busyness worker (Umesh)
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      
      From: Rodrigo Vivi <rodrigo.vivi@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/ZRWMI1HmUYPGGylp@intel.com
  2. 28 Sep, 2023 15 commits
    • Merge tag 'drm-misc-fixes-2023-09-28' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes · 8c4a5e89
      Dave Airlie authored
      Short summary of fixes pull:
      
       * ivpu:
         * Add PCI ids for Arrow Lake
         * Fix memory corruption during IPC
         * Avoid dmesg flooding
         * 40xx: Wait for clock resource
         * 40xx: Fix interrupt usage
         * 40xx: Support caching when loading firmware
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      
      From: Thomas Zimmermann <tzimmermann@suse.de>
      Link: https://patchwork.freedesktop.org/patch/msgid/20230928081208.GA7881@linux-uq9g
    • Merge tag 'spi-fix-v6.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi · 9ed22ae6
      Linus Torvalds authored
      Pull spi fixes from Mark Brown:
       "A small set of device specific fixes, the most major one is for the
        GXP driver which would probably have been confusing some callers with
        returning the length rather than 0 on successful writes"
      
      * tag 'spi-fix-v6.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
        spi: spi-gxp: BUG: Correct spi write return value
        dt-bindings: spi: fsl-imx-cspi: Document missing entries
        spi: cs42l43: Remove spurious pm_runtime_disable
    • Merge tag 'loongarch-fixes-6.6-2' of... · 5d959343
      Linus Torvalds authored
      Merge tag 'loongarch-fixes-6.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
      
      Pull LoongArch fixes from Huacai Chen:
       "Fix high_memory calculation and module loader errors with latest
        binutils"
      
      * tag 'loongarch-fixes-6.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
        LoongArch: Add support for 64_PCREL relocation type
        LoongArch: Add support for 32_PCREL relocation type
        LoongArch: Define relocation types for ABI v2.10
        LoongArch: numa: Fix high_memory calculation
    • Merge tag 'mips-fixes_6.6_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux · 52a6d9b5
      Linus Torvalds authored
      Pull MIPS fix from Thomas Bogendoerfer:
      
       - fix Alchemy build with MMC support disabled
      
      * tag 'mips-fixes_6.6_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
        MIPS: Alchemy: only build mmc support helpers if au1xmmc is enabled
    • ata: libata-eh: Fix compilation warning in ata_eh_link_report() · 49728bdc
      Damien Le Moal authored
      The 6 bytes length of the tries_buf string in ata_eh_link_report() is
      too short and results in a gcc compilation warning with W=1:
      
      drivers/ata/libata-eh.c: In function ‘ata_eh_link_report’:
      drivers/ata/libata-eh.c:2371:59: warning: ‘%d’ directive output may be truncated writing between 1 and 11 bytes into a region of size 4 [-Wformat-truncation=]
       2371 |                 snprintf(tries_buf, sizeof(tries_buf), " t%d",
            |                                                           ^~
      drivers/ata/libata-eh.c:2371:56: note: directive argument in the range [-2147483648, 4]
       2371 |                 snprintf(tries_buf, sizeof(tries_buf), " t%d",
            |                                                        ^~~~~~
      drivers/ata/libata-eh.c:2371:17: note: ‘snprintf’ output between 4 and 14 bytes into a destination of size 6
       2371 |                 snprintf(tries_buf, sizeof(tries_buf), " t%d",
            |                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
       2372 |                          ap->eh_tries);
            |                          ~~~~~~~~~~~~~
      
      Avoid this warning by increasing the string size to 16B.
      Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
      Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
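
The byte counts in the gcc warning above can be checked in user space. The sketch below uses the standard behaviour of `snprintf()` with a NULL buffer and size 0, which only reports the length it would write. The helper names are hypothetical, and the worst case assumes a 32-bit `int`.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Bytes needed, including the terminating NUL, to hold " t%d" for the
 * given value. */
int model_tries_len(int tries)
{
	return snprintf(NULL, 0, " t%d", tries) + 1;
}

/* Returns 1 if formatting into a buffer of buf_size bytes would
 * truncate the output, 0 otherwise. */
int model_would_truncate(int tries, size_t buf_size)
{
	assert(buf_size > 0);
	return (size_t)model_tries_len(tries) > buf_size;
}
```

For the range gcc reports, `model_tries_len(4)` gives the lower bound and `model_tries_len(INT_MIN)` the upper bound, which is why a 6-byte buffer can truncate and a 16-byte one cannot.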
    • ata: libata-core: Fix compilation warning in ata_dev_config_ncq() · ed518d9b
      Damien Le Moal authored
      The 24 bytes length allocated to the ncq_desc string in
      ata_dev_config_lba() for ata_dev_config_ncq() to use is too short,
      causing the following gcc compilation warnings when compiling with W=1:
      
      drivers/ata/libata-core.c: In function ‘ata_dev_configure’:
      drivers/ata/libata-core.c:2378:56: warning: ‘%d’ directive output may be truncated writing between 1 and 2 bytes into a region of size between 1 and 11 [-Wformat-truncation=]
       2378 |                 snprintf(desc, desc_sz, "NCQ (depth %d/%d)%s", hdepth,
            |                                                        ^~
      In function ‘ata_dev_config_ncq’,
          inlined from ‘ata_dev_config_lba’ at drivers/ata/libata-core.c:2649:8,
          inlined from ‘ata_dev_configure’ at drivers/ata/libata-core.c:2952:9:
      drivers/ata/libata-core.c:2378:41: note: directive argument in the range [1, 32]
       2378 |                 snprintf(desc, desc_sz, "NCQ (depth %d/%d)%s", hdepth,
            |                                         ^~~~~~~~~~~~~~~~~~~~~
      drivers/ata/libata-core.c:2378:17: note: ‘snprintf’ output between 16 and 31 bytes into a destination of size 24
       2378 |                 snprintf(desc, desc_sz, "NCQ (depth %d/%d)%s", hdepth,
            |                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
       2379 |                         ddepth, aa_desc);
            |                         ~~~~~~~~~~~~~~~~
      
      Avoid these warnings and the potential truncation by changing the size
      of the ncq_desc string to 32 characters.
      Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
      Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: sd: Do not issue commands to suspended disks on shutdown · 99398d20
      Damien Le Moal authored
      If an error occurs when resuming a host adapter before the devices
      attached to the adapter are resumed, the adapter low level driver may
      remove the scsi host, resulting in a call to sd_remove() for the
      disks of the host. This in turn results in a call to sd_shutdown() which
      will issue a synchronize cache command and a start stop unit command to
      spindown the disk. sd_shutdown() issues the commands only if the device
      is not already runtime suspended but does not check the power state for
      system-wide suspend/resume. That is, the commands may be issued with the
      device in a suspended state, which causes PM resume to hang, forcing a
      reset of the machine to recover.
      
      Fix this by tracking the suspended state of a disk with a new
      suspended boolean field in the scsi_disk structure. This flag is set to
      true when the disk is suspended in sd_suspend_common() and cleared when
      the disk is resumed with sd_resume(). When suspended is true,
      sd_shutdown() is not executed from sd_remove().
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Reviewed-by: Bart Van Assche <bvanassche@acm.org>
      Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
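
The flag-based gating described in the sd commit above can be modelled in a few lines of user-space C. This is a sketch under assumptions, not the sd driver code. Only the `suspended` field name comes from the commit message. The struct, the function names, and the command counter are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the disk state. */
struct model_disk {
	bool suspended;
	int commands_issued;	/* counts commands sent to the device */
};

void model_suspend(struct model_disk *d)
{
	d->suspended = true;
}

void model_resume(struct model_disk *d)
{
	d->suspended = false;
}

/* Stands in for the cache flush and spin-down commands. */
void model_shutdown(struct model_disk *d)
{
	d->commands_issued += 2;
}

/* Removal path: skip the shutdown commands for a suspended disk. */
void model_remove(struct model_disk *d)
{
	if (!d->suspended)
		model_shutdown(d);
}
```

Calling `model_remove()` on a suspended disk leaves `commands_issued` unchanged. After `model_resume()`, the same call issues the two shutdown commands.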
    • ata: libata-core: Do not register PM operations for SAS ports · 75e2bd5f
      Damien Le Moal authored
      libsas does its own domain based power management of ports. For such
      ports, libata should not use a device type defining power management
      operations as executing these operations for suspend/resume in addition
      to libsas calls to ata_sas_port_suspend() and ata_sas_port_resume() is
      not necessary (and likely dangerous to do, even though problems are not
      seen currently).
      
      Introduce the new ata_port_sas_type device_type for ports managed by
      libsas. This new device type is used in ata_tport_add() and is defined
      without power management operations.
      
      Fixes: 2fcbdcb4 ("[SCSI] libata: export ata_port suspend/resume infrastructure for sas")
      Cc: stable@vger.kernel.org
      Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Tested-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com>
      Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
      Reviewed-by: John Garry <john.g.garry@oracle.com>
      Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
    • ata: libata-scsi: Fix delayed scsi_rescan_device() execution · 8b4d9469
      Damien Le Moal authored
      Commit 6aa0365a ("ata: libata-scsi: Avoid deadlock on rescan after
      device resume") modified ata_scsi_dev_rescan() to check the scsi device
      "is_suspended" power field to ensure that the scsi device associated
      with an ATA device is fully resumed when scsi_rescan_device() is
      executed. However, this fix is problematic as:
      1) It relies on a PM internal field that should not be used without PM
         device locking protection.
      2) The check for is_suspended and the call to scsi_rescan_device() are
         not atomic and a suspend PM event may be triggered between them,
         causing scsi_rescan_device() to be called on a suspended device and
         to block in that function while holding the scsi device lock. This
         would deadlock a following resume operation.
      These problems can trigger PM deadlocks on resume, especially with
      resume operations triggered quickly after or during suspend operations.
      E.g., a simple bash script like:
      
      for (( i=0; i<10; i++ )); do
      	echo "+2" > /sys/class/rtc/rtc0/wakealarm
      	echo mem > /sys/power/state
      done
      
      that triggers a resume 2 seconds after starting suspending a system can
      quickly lead to a PM deadlock preventing the system from correctly
      resuming.
      
      Fix this by replacing the check on is_suspended with a check on the
      return value given by scsi_rescan_device() as that function will fail if
      called against a suspended device. Also make sure rescan tasks already
      scheduled are first cancelled before suspending an ata port.
      
      Fixes: 6aa0365a ("ata: libata-scsi: Avoid deadlock on rescan after device resume")
      Cc: stable@vger.kernel.org
      Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
      Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
      Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
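
The caller-side idea in the commit above is to act on the rescan's return value and retry rather than peek at a PM field first. It can be sketched as follows. This is a user-space model under assumptions, not the libata code. All names are hypothetical, and the loop stands in for what a driver would more likely do by rescheduling deferred work with a delay.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical device model: becomes ready after a number of polls. */
struct model_dev {
	int polls_until_running;
};

/* Stand-in for a rescan that fails while the device is not running. */
int model_rescan(struct model_dev *dev)
{
	if (dev->polls_until_running > 0) {
		dev->polls_until_running--;
		return -EWOULDBLOCK;
	}
	return 0;
}

/* Retry the rescan up to max_tries times. Returns the number of
 * attempts used on success, or a negative errno value if the device
 * never became ready or another error occurred. */
int model_rescan_with_retry(struct model_dev *dev, int max_tries)
{
	for (int attempt = 1; attempt <= max_tries; attempt++) {
		int ret = model_rescan(dev);

		if (ret == 0)
			return attempt;
		if (ret != -EWOULDBLOCK)
			return ret;
	}
	return -EWOULDBLOCK;
}
```

Because the decision is taken from the rescan result itself, there is no window between a separate "is it suspended?" check and the rescan call.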
    • scsi: Do not attempt to rescan suspended devices · ff48b378
      Damien Le Moal authored
      scsi_rescan_device() takes a scsi device lock before executing a device
      handler and device driver rescan methods. Waiting for the completion of
      any command issued to the device by these methods will thus be done with
      the device lock held. As a result, there is a risk of deadlocking within
      the power management code if scsi_rescan_device() is called to handle a
      device resume with the associated scsi device not yet resumed.
      
      Avoid this situation by checking that the target scsi device is in the
      running state, that is, fully capable of executing commands, before
      proceeding with the rescan, and bail out returning -EWOULDBLOCK otherwise.
      With this error return, the caller can retry rescanning the device after
      a delay.
      
      The state check is done with the device lock held and is thus safe
      against incoming suspend power management operations.
      
      Fixes: 6aa0365a ("ata: libata-scsi: Avoid deadlock on rescan after device resume")
      Cc: stable@vger.kernel.org
      Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
      Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
      Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
      Reviewed-by: Bart Van Assche <bvanassche@acm.org>
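
The callee-side check in the commit above tests the device state while the lock is held and bails out with -EWOULDBLOCK if the device is not running. A single-threaded model of that shape is sketched below. It is an illustration under assumptions, not the SCSI core implementation. The boolean `locked` only stands in for the device lock, and every identifier is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

enum model_state { MODEL_RUNNING, MODEL_QUIESCED };

struct model_sdev {
	bool locked;		/* stand-in for the device lock */
	enum model_state state;
	int rescans;		/* number of rescans actually performed */
};

int model_rescan_device(struct model_sdev *sdev)
{
	int ret = 0;

	assert(!sdev->locked);	/* single-threaded model: lock must be free */
	sdev->locked = true;

	/* The state is examined only while the lock is held. */
	if (sdev->state != MODEL_RUNNING)
		ret = -EWOULDBLOCK;
	else
		sdev->rescans++;

	sdev->locked = false;
	return ret;
}
```

A device that is not running is never rescanned, and the lock is released on both paths, so a later resume is not blocked by this function.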
    • ata: libata-scsi: Disable scsi device manage_system_start_stop · aa3998db
      Damien Le Moal authored
      The introduction of a device link to create a consumer/supplier
      relationship between the scsi device of an ATA device and the ATA port
      of that ATA device fixes the ordering of system suspend and resume
      operations. For suspend, the scsi device is suspended first and the ata
      port after it. This is fine as this allows the synchronize cache and
      START STOP UNIT commands issued by the scsi disk driver to be executed
      before the ata port is disabled.
      
      For resume operations, the ata port is resumed first, followed
      by the scsi device. This allows the request queue of the scsi
      device to be unfrozen after the ata port resume is scheduled in EH,
      thus preventing new requests from being prematurely issued to the ATA device.
      Since libata sets manage_system_start_stop to 1, the scsi disk resume
      operation also results in issuing a START STOP UNIT command to the
      device being resumed so that the device exits standby power mode.
      
      However, restoring the ATA device to the active power mode must be
      synchronized with libata EH processing of the port resume operation to
      avoid either 1) seeing the start stop unit command being received too
      early when the port is not yet resumed and ready to accept commands, or
      2) seeing it received after the port resume process issues commands such
      as IDENTIFY to revalidate the device. In this last case, the risk is that
      the device revalidation fails with timeout errors as the drive is still
      spun down.
      
      Commit 0a858905 ("ata,scsi: do not issue START STOP UNIT on resume")
      disabled issuing the START STOP UNIT command to avoid issues with it.
      But this is incorrect as transitioning a device to the active power
      mode from the standby power mode set on suspend requires a media access
      command. The IDENTIFY, READ LOG and SET FEATURES commands executed in
      libata EH context triggered by the ata port resume operation may thus
      fail.
      
      Fix these synchronization issues by handling device power mode
      transitions for system suspend and resume directly in libata EH context,
      without relying on the scsi disk driver management triggered with the
      manage_system_start_stop flag.
      
      To do this, the following libata helper functions are introduced:
      
      1) ata_dev_power_set_standby():
      
      This function issues a STANDBY IMMEDIATE command to transition a device
      to the standby power mode. For HDDs, this spins down the disks. This
      function applies only to ATA and ZAC devices and does nothing otherwise.
      This function also does nothing for devices that have the
      ATA_FLAG_NO_POWEROFF_SPINDOWN or ATA_FLAG_NO_HIBERNATE_SPINDOWN flag
      set.
      
      For suspend, call ata_dev_power_set_standby() in
      ata_eh_handle_port_suspend() before the port is disabled and frozen.
      ata_eh_unload() is also modified to transition all enabled devices to
      the standby power mode when the system is shutdown or devices removed.
      
      2) ata_dev_power_set_active():

      This function applies to ATA or ZAC devices and issues a VERIFY command
      for 1 sector at LBA 0 to transition the device to the active power mode.
      For HDDs, since this function will complete only once the disk has spun
      up, its execution uses the same timeouts as for reset, to give the drive
      enough time to complete spinup without triggering a command timeout.
      
      For resume, call ata_dev_power_set_active() in
      ata_eh_revalidate_and_attach() after the port has been enabled and
      before any other command is issued to the device.
      
      With these changes, the manage_system_start_stop and no_start_on_resume
      scsi device flags do not need to be set in ata_scsi_dev_config(). The
      flag manage_runtime_start_stop is still set to allow the sd driver to
      spinup/spindown a disk through the sd runtime operations.
      
      Fixes: 0a858905 ("ata,scsi: do not issue START STOP UNIT on resume")
      Cc: stable@vger.kernel.org
      Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
      Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
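
The ordering that the commit above sets up is standby before the port is disabled on suspend, and wake-up after the port is enabled but before revalidation on resume. It can be pictured with a small step log. This is a user-space sketch under assumptions, not libata EH code. The step strings and all function names are hypothetical labels for the stages named in the commit message.

```c
#include <assert.h>
#include <string.h>

#define MODEL_LOG_MAX 8

struct model_port {
	const char *log[MODEL_LOG_MAX];
	int n;
};

static void model_log(struct model_port *p, const char *step)
{
	assert(p->n < MODEL_LOG_MAX);
	p->log[p->n++] = step;
}

/* Suspend: put the device in standby while the port can still carry
 * commands, then disable the port. */
void model_port_suspend(struct model_port *p)
{
	model_log(p, "standby");
	model_log(p, "port-disable");
}

/* Resume: enable the port, wake the device, then revalidate it. */
void model_port_resume(struct model_port *p)
{
	model_log(p, "port-enable");
	model_log(p, "set-active");
	model_log(p, "revalidate");
}

/* Returns 1 if step number i (0-based) in the log equals name. */
int model_step_is(const struct model_port *p, int i, const char *name)
{
	return i >= 0 && i < p->n && strcmp(p->log[i], name) == 0;
}
```

Running a suspend followed by a resume records five steps, with "set-active" always logged before "revalidate". That is the property the commit relies on to avoid revalidation timeouts against a spun-down drive.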
    • scsi: sd: Differentiate system and runtime start/stop management · 3cc2ffe5
      Damien Le Moal authored
      The underlying device and driver of a SCSI disk may have different
      system and runtime power mode control requirements. This is because
      runtime power management affects only the SCSI disk, while system level
      power management affects all devices, including the controller for the
      SCSI disk.
      
      For instance, issuing a START STOP UNIT command when a SCSI disk is
      runtime suspended and resumed is fine: the command is translated to a
      STANDBY IMMEDIATE command to spin down the ATA disk and to a VERIFY
      command to wake it up. The SCSI disk runtime operations have no effect
      on the ata port device used to connect the ATA disk. However, for
      system suspend/resume operations, the ATA port used to connect the
      device will also be suspended and resumed, with the resume operation
      requiring re-validating the device link and the device itself. In this
      case, issuing a VERIFY command to spin up the disk must be done before
      starting to revalidate the device, when the ata port is being resumed.
      In such a case, we must not allow the SCSI disk driver to issue START
      STOP UNIT commands.
      
      Allow a low level driver to refine the SCSI disk start/stop management
      by differentiating system and runtime cases with two new SCSI device
      flags: manage_system_start_stop and manage_runtime_start_stop. These new
      flags replace the current manage_start_stop flag. Drivers setting
      manage_start_stop are modified to set both new flags, thus preserving the
      existing start/stop management behavior. For backward compatibility, the
      old manage_start_stop sysfs device attribute is kept as a read-only
      attribute showing a value of 1 for devices enabling both new flags and 0
      otherwise.
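
      As an illustration of the flag split and of the value reported by the
      legacy attribute, here is a minimal userspace sketch. The struct and
      function names are hypothetical; only the two flag names and the
      "1 if both are set, 0 otherwise" rule come from this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical holder for the two new per-device flags. */
struct sdev_flags {
    bool manage_system_start_stop;
    bool manage_runtime_start_stop;
};

/* What a driver that used to set manage_start_stop does after the split. */
static void set_legacy_behavior(struct sdev_flags *f)
{
    assert(f != NULL);
    f->manage_system_start_stop = true;
    f->manage_runtime_start_stop = true;
}

/* Value shown by the read-only legacy manage_start_stop attribute. */
static int legacy_manage_start_stop(const struct sdev_flags *f)
{
    assert(f != NULL);
    return f->manage_system_start_stop && f->manage_runtime_start_stop;
}
```

      A device that enables only one of the two flags therefore reads back
      as 0 through the old attribute.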
      
      Fixes: 0a858905 ("ata,scsi: do not issue START STOP UNIT on resume")
      Cc: stable@vger.kernel.org
      Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
      Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
      3cc2ffe5
    • Damien Le Moal's avatar
      ata: libata-scsi: link ata port and scsi device · fb99ef17
      Damien Le Moal authored
      There is no direct device ancestry defined between an ata_device and
      its scsi device which prevents the power management code from correctly
      ordering suspend and resume operations. Create such ancestry with the
      ata device as the parent to ensure that the scsi device (child) is
      suspended before the ata device and that resume handles the ata device
      before the scsi device.
      
      The parent-child (supplier-consumer) relationship is established between
      the ata_port (parent) and the scsi device (child) with the function
      device_link_add(). The parent used is not the ata_device as the PM
      operations are defined per port and the status of all devices connected
      through that port is controlled from the port operations.
      
      The device link is established with the new function
      ata_scsi_slave_alloc(), and this function is used to define the
      ->slave_alloc callback of the scsi host template of all ata drivers.
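
      The ordering that the supplier-consumer link is meant to guarantee can
      be pictured with a toy userspace model. This is a sketch only: the
      struct, the enum and pm_order() are hypothetical and do not reflect
      how the driver core implements device links.

```c
#include <assert.h>
#include <stddef.h>

enum pm_op { PM_SUSPEND, PM_RESUME };

/* Hypothetical device node with an optional supplier (parent). */
struct pm_dev {
    const char *name;
    const struct pm_dev *supplier;
};

/*
 * Fill out[0] and out[1] with the order in which a consumer and its
 * supplier are handled: consumer first on suspend, supplier first on
 * resume.
 */
static void pm_order(const struct pm_dev *consumer, enum pm_op op,
                     const struct pm_dev *out[2])
{
    assert(consumer != NULL && consumer->supplier != NULL);
    if (op == PM_SUSPEND) {
        out[0] = consumer;
        out[1] = consumer->supplier;
    } else {
        out[0] = consumer->supplier;
        out[1] = consumer;
    }
}
```

      With the ata_port as supplier and the scsi device as consumer, the
      model yields "scsi device, then port" for suspend and the reverse for
      resume.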
      
      Fixes: a19a93e4 ("scsi: core: pm: Rely on the device driver core for async power management")
      Cc: stable@vger.kernel.org
      Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
      Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
      Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
      Reviewed-by: John Garry <john.g.garry@oracle.com>
      fb99ef17
    • Damien Le Moal's avatar
      ata: libata-core: Fix port and device removal · 84d76529
      Damien Le Moal authored
      Whenever an ATA adapter driver is removed (e.g. rmmod),
      ata_port_detach() is called repeatedly for all the adapter ports to
      remove (unload) the devices attached to the port and delete the port
      device itself. Removing of devices is done using libata EH with the
      ATA_PFLAG_UNLOADING port flag set. This causes libata EH to execute
      ata_eh_unload() which disables all devices attached to the port.
      
      ata_port_detach() finishes by calling scsi_remove_host() to remove the
      scsi host associated with the port. This function will trigger the
      removal of all scsi devices attached to the host and in the case of
      disks, calls to sd_shutdown() which will flush the device write cache
      and stop the device. However, given that the devices were already
      disabled by ata_eh_unload(), the SYNCHRONIZE CACHE and START STOP UNIT
      commands fail. E.g. running "rmmod ahci" without first removing sd_mod
      results in error messages like:
      
      ata13.00: disable device
      sd 0:0:0:0: [sda] Synchronizing SCSI cache
      sd 0:0:0:0: [sda] Synchronize Cache(10) failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
      sd 0:0:0:0: [sda] Stopping disk
      sd 0:0:0:0: [sda] Start/Stop Unit failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
      
      Fix this by removing all scsi devices of the ata devices connected to
      the port before scheduling libata EH to disable the ATA devices.
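
      Why the order matters can be shown with a small userspace model. It is
      a sketch under simplifying assumptions, not the libata code: the
      struct and all function names are hypothetical, and the scsi device
      removal is reduced to "issue two commands, each of which fails if the
      ATA device is already disabled".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one port with one disk attached. */
struct port_model {
    bool ata_enabled;   /* ATA device still accepts commands */
    bool scsi_present;  /* scsi device not yet removed */
    int failed_cmds;    /* commands issued to a disabled device */
};

/* Removing the scsi device flushes the cache and stops the disk. */
static void remove_scsi_device(struct port_model *p)
{
    assert(p != NULL);
    if (!p->scsi_present)
        return;
    if (!p->ata_enabled)
        p->failed_cmds += 2; /* cache flush + stop both fail */
    p->scsi_present = false;
}

static void disable_ata_device(struct port_model *p)
{
    assert(p != NULL);
    p->ata_enabled = false;
}

/* Order before the fix: disable first, then remove the scsi device. */
static int detach_disable_first(struct port_model *p)
{
    disable_ata_device(p);
    remove_scsi_device(p);
    return p->failed_cmds;
}

/* Order after the fix: remove the scsi device while the disk still works. */
static int detach_remove_first(struct port_model *p)
{
    remove_scsi_device(p);
    disable_ata_device(p);
    return p->failed_cmds;
}
```

      In the model, only the "disable first" order leaves failed commands
      behind, matching the two failures in the log above.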
      
      Fixes: 720ba126 ("[PATCH] libata-hp: update unload-unplug")
      Cc: stable@vger.kernel.org
      Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
      Tested-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com>
      Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
      Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
      84d76529
    • Damien Le Moal's avatar
      ata: libata-core: Fix ata_port_request_pm() locking · 3b8e0af4
      Damien Le Moal authored
      The function ata_port_request_pm() checks the port flag
      ATA_PFLAG_PM_PENDING and calls ata_port_wait_eh() if this flag is set to
      ensure that power management operations for a port are not scheduled
      simultaneously. However, this flag check is done without holding the
      port lock.
      
      Fix this by taking the port lock on entry to the function and checking
      the flag under this lock. The lock is released and re-taken if
      ata_port_wait_eh() needs to be called. The two WARN_ON() macros checking
      that the ATA_PFLAG_PM_PENDING flag was cleared are removed as the first
      call is racy and the second one is done without holding the port lock.
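
      The locking pattern described above (check the flag under the lock,
      drop the lock around the wait, retake it afterwards) can be sketched
      in single-threaded userspace C. Everything here is hypothetical: the
      toy lock only tracks whether it is held, and the wait is simulated by
      clearing the pending flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PFLAG_PM_PENDING (1u << 0) /* hypothetical stand-in flag */

struct port {
    bool locked;          /* toy lock state, no real concurrency */
    unsigned int pflags;
    int eh_waits;         /* how many times we had to wait */
};

static void port_lock(struct port *p)
{
    assert(!p->locked);
    p->locked = true;
}

static void port_unlock(struct port *p)
{
    assert(p->locked);
    p->locked = false;
}

/* Simulated wait: must run unlocked; the pending operation completes. */
static void port_wait_eh(struct port *p)
{
    assert(!p->locked);
    p->pflags &= ~PFLAG_PM_PENDING;
    p->eh_waits++;
}

/* Request a PM operation, checking the pending flag under the lock. */
static void port_request_pm(struct port *p)
{
    assert(p != NULL);
    port_lock(p);
    if (p->pflags & PFLAG_PM_PENDING) {
        port_unlock(p);   /* cannot wait while holding the lock */
        port_wait_eh(p);
        port_lock(p);
    }
    p->pflags |= PFLAG_PM_PENDING;
    port_unlock(p);
}
```

      The asserts in the toy lock helpers document the invariant: the flag
      is only read or written with the lock held, and the wait only happens
      with the lock released.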
      
      Fixes: 5ef41082 ("ata: add ata port system PM callbacks")
      Cc: stable@vger.kernel.org
      Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Tested-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com>
      Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
      Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
      Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
      Reviewed-by: Bart Van Assche <bvanassche@acm.org>
      3b8e0af4
  3. 27 Sep, 2023 12 commits