Commits · 502b291568fc7faf1ebdb2c2590f12851db0ff76 · nexedi / linux

08 Oct, 2018 2 commits

bcache: trace missed reading by cache_missed · 502b2915

Tang Junhui authored Oct 08, 2018

Missed reading IOs are identified by s->cache_missed, not the
s->cache_miss, so in trace_bcache_read() using trace_bcache_read
to identify whether the IO is missed or not.
Signed-off-by: Tang Junhui <tang.junhui.linux@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

502b2915

bcache: account size of buckets used in uuid write to ca->meta_sectors_written · 7a55948d

Shenghui Wang authored Oct 08, 2018

UUIDs are considered as metadata. __uuid_write should add the number
of buckets (in sectors) written to disk to ca->meta_sectors_written.
Currently only 1 bucket is used in uuid write.

Steps to test:
1) create a fresh backing device and a fresh cache device separately.
   The backing device didn't attach to any cache set.
2) cd /sys/block/<cache device>/bcache
   cat metadata_written      // record the output value
   cat bucket_size
3) attach the backing device to cache set
4) cat metadata_written
   The output value is almost the same as the value in step 2
   before the change.
   After the change, the value is bigger about 1 bucket size.
Signed-off-by: Shenghui Wang <shhuiw@foxmail.com>
Reviewed-by: Tang Junhui <tang.junhui.linux@gmail.com>
Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

7a55948d

05 Oct, 2018 3 commits

blk-mq-debugfs: Also show requests that have not yet been started · 6d8623a7

Bart Van Assche authored Oct 04, 2018

When debugging e.g. the SCSI timeout handler it is important that
requests that have not yet been started or that already have
completed are also reported through debugfs.

Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

6d8623a7

Merge branch 'nvme-4.20' of git://git.infradead.org/nvme into for-4.20/block · 4f5735f3

Jens Axboe authored Oct 05, 2018

Pull NVMe updates from Christoph:

"A relatively boring merge window:

 - better AEN tracing (Chaitanya)
 - NUMA aware PCIe multipathing (me)
 - RDMA workqueue fixes (Sagi)
 - better bio usage in the target (Sagi)
 - FC rework for target removal (James)
 - better multipath handling of ->queue_rq failures (James)
 - various cleanups (Milan)"

* 'nvme-4.20' of git://git.infradead.org/nvme:
  nvmet-rdma: use a private workqueue for delete
  nvme: take node locality into account when selecting a path
  nvmet: don't split large I/Os unconditionally
  nvme: call nvme_complete_rq when nvmf_check_ready fails for mpath I/O
  nvme-core: add async event trace helper
  nvme_fc: add 'nvme_discovery' sysfs attribute to fc transport device
  nvmet_fc: support target port removal with nvmet layer
  nvme-fc: fix for a minor typos
  nvmet: remove redundant module prefix
  nvme: fix typo in nvme_identify_ns_descs

4f5735f3

nvmet-rdma: use a private workqueue for delete · 2acf70ad

Sagi Grimberg authored Sep 27, 2018

Queue deletion is done asynchronous when the last reference on the queue
is dropped.  Thus, in order to make sure we don't over allocate under a
connect/disconnect storm, we let queue deletion complete before making
forward progress.

However, given that we flush the system_wq from rdma_cm context which
runs from a workqueue context, we can have a circular locking complaint
[1]. Fix that by using a private workqueue for queue deletion.

[1]:
======================================================
WARNING: possible circular locking dependency detected
4.19.0-rc4-dbg+ #3 Not tainted
------------------------------------------------------
kworker/5:0/39 is trying to acquire lock:
00000000a10b6db9 (&id_priv->handler_mutex){+.+.}, at: rdma_destroy_id+0x6f/0x440 [rdma_cm]

but task is already holding lock:
00000000331b4e2c ((work_completion)(&queue->release_work)){+.+.}, at: process_one_work+0x3ed/0xa20

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #3 ((work_completion)(&queue->release_work)){+.+.}:
       process_one_work+0x474/0xa20
       worker_thread+0x63/0x5a0
       kthread+0x1cf/0x1f0
       ret_from_fork+0x24/0x30

-> #2 ((wq_completion)"events"){+.+.}:
       flush_workqueue+0xf3/0x970
       nvmet_rdma_cm_handler+0x133d/0x1734 [nvmet_rdma]
       cma_ib_req_handler+0x72f/0xf90 [rdma_cm]
       cm_process_work+0x2e/0x110 [ib_cm]
       cm_req_handler+0x135b/0x1c30 [ib_cm]
       cm_work_handler+0x2b7/0x38cd [ib_cm]
       process_one_work+0x4ae/0xa20
nvmet_rdma:nvmet_rdma_cm_handler: nvmet_rdma: disconnected (10): status 0 id 0000000040357082
       worker_thread+0x63/0x5a0
       kthread+0x1cf/0x1f0
       ret_from_fork+0x24/0x30
nvme nvme0: Reconnecting in 10 seconds...

-> #1 (&id_priv->handler_mutex/1){+.+.}:
       __mutex_lock+0xfe/0xbe0
       mutex_lock_nested+0x1b/0x20
       cma_ib_req_handler+0x6aa/0xf90 [rdma_cm]
       cm_process_work+0x2e/0x110 [ib_cm]
       cm_req_handler+0x135b/0x1c30 [ib_cm]
       cm_work_handler+0x2b7/0x38cd [ib_cm]
       process_one_work+0x4ae/0xa20
       worker_thread+0x63/0x5a0
       kthread+0x1cf/0x1f0
       ret_from_fork+0x24/0x30

-> #0 (&id_priv->handler_mutex){+.+.}:
       lock_acquire+0xc5/0x200
       __mutex_lock+0xfe/0xbe0
       mutex_lock_nested+0x1b/0x20
       rdma_destroy_id+0x6f/0x440 [rdma_cm]
       nvmet_rdma_release_queue_work+0x8e/0x1b0 [nvmet_rdma]
       process_one_work+0x4ae/0xa20
       worker_thread+0x63/0x5a0
       kthread+0x1cf/0x1f0
       ret_from_fork+0x24/0x30

Fixes: 777dc823 ("nvmet-rdma: occasionally flush ongoing controller teardown")
Reported-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Tested-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>

2acf70ad

03 Oct, 2018 2 commits

block: Finish renaming REQ_DISCARD into REQ_OP_DISCARD · 9305455a

Bart Van Assche authored Oct 03, 2018

Some time ago REQ_DISCARD was renamed into REQ_OP_DISCARD. Some comments
and documentation files were not updated however. Update these comments
and documentation files. See also commit 4e1b2d52 ("block, fs,
drivers: remove REQ_OP compat defs and related code").
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Mike Christie <mchristi@redhat.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

9305455a

cdrom: fix improper type cast, which can leat to information leak. · e4f3aa2e

Young_X authored Oct 03, 2018

There is another cast from unsigned long to int which causes
a bounds check to fail with specially crafted input. The value is
then used as an index in the slot array in cdrom_slot_status().

This issue is similar to CVE-2018-16658 and CVE-2018-10940.
Signed-off-by: Young_X <YangX92@hotmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

e4f3aa2e

02 Oct, 2018 1 commit

pktcdvd: fix fall-through annotation · fb6360b1

Gustavo A. R. Silva authored Oct 02, 2018

Replace "fallthru" with a proper "fall through" annotation.

This fix is part of the ongoing efforts to enabling
-Wimplicit-fallthrough
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

fb6360b1

01 Oct, 2018 10 commits

nvme: take node locality into account when selecting a path · f3334447

Christoph Hellwig authored Sep 11, 2018

Make current_path an array with an entry for every possible node, and
cache the best path on a per-node basis.  Take the node distance into
account when selecting it.  This is primarily useful for dual-ported PCIe
devices which are connected to PCIe root ports on different sockets.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>

f3334447

nvmet: don't split large I/Os unconditionally · 73383adf

Sagi Grimberg authored Sep 28, 2018

If we know that the I/O size exceeds our inline bio vec, no
point using it and split the rest to begin with. We could
in theory reuse the inline bio and only allocate the bio_vec,
but its really not worth optimizing for.
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>

73383adf

nvme: call nvme_complete_rq when nvmf_check_ready fails for mpath I/O · 783f4a44

James Smart authored Sep 27, 2018

When an io is rejected by nvmf_check_ready() due to validation of the
controller state, the nvmf_fail_nonready_command() will normally return
BLK_STS_RESOURCE to requeue and retry.  However, if the controller is
dying or the I/O is marked for NVMe multipath, the I/O is failed so that
the controller can terminate or so that the io can be issued on a
different path.  Unfortunately, as this reject point is before the
transport has accepted the command, blk-mq ends up completing the I/O
and never calls nvme_complete_rq(), which is where multipath may preserve
or re-route the I/O. The end result is, the device user ends up seeing an
EIO error.

Example: single path connectivity, controller is under load, and a reset
is induced.  An I/O is received:

  a) while the reset state has been set but the queues have yet to be
     stopped; or
  b) after queues are started (at end of reset) but before the reconnect
     has completed.

The I/O finishes with an EIO status.

This patch makes the following changes:

  - Adds the HOST_PATH_ERROR pathing status from TP4028
  - Modifies the reject point such that it appears to queue successfully,
    but actually completes the io with the new pathing status and calls
    nvme_complete_rq().
  - nvme_complete_rq() recognizes the new status, avoids resetting the
    controller (likely was already done in order to get this new status),
    and calls the multipather to clear the current path that errored.
    This allows the next command (retry or new command) to select a new
    path if there is one.
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>

783f4a44

nvme-core: add async event trace helper · 09bd1ff4

Chaitanya Kulkarni authored Sep 17, 2018

This patch adds a new event for nvme async event notification.
We print the async event in the decoded format when we recognize
the event otherwise we just dump the result.
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

09bd1ff4

nvme_fc: add 'nvme_discovery' sysfs attribute to fc transport device · 97faec53

James Smart authored Sep 13, 2018

The fc transport device should allow for a rediscovery, as userspace
might have lost the events. Example is udev events not handled during
system startup.

This patch add a sysfs entry 'nvme_discovery' on the fc class to
have it replay all udev discovery events for all local port/remote
port address pairs.
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

97faec53

nvmet_fc: support target port removal with nvmet layer · ea96d649

James Smart authored Aug 09, 2018

Currently, if a targetport has been connected to via the nvmet config
(in other words, the add_port() transport routine called, and the nvmet
port pointer stored for using in upcalls on new io), and if the
targetport is then removed (say the lldd driver decides to unload or
fully reset its hardware) and then re-added (the lldd driver reloads or
reinits its hardware), the port pointer has been lost so there's no way
to continue to post commands up to nvmet via the transport port.

Correct by allocating a small "port context" structure that will be
linked to by the targetport. The context will save the targetport WWN's
and the nvmet port pointer to use for it. Initial allocation will occur
when the targetport is bound to via add_port. The context will be
deallocated when remove_port() is called. If a targetport is removed
while nvmet has the active port context, the targetport will be unlinked
from the port context before removal. If a new targetport is registered,
the port contexts without a binding are looked through and if the WWN's
match (so it's the same as nvmet's port context) the port context is
linked to the new target port. Thus new io can be received on the new
targetport and operation resumes with nvmet.

Additionally, this also resolves nvmet configuration changing out from
underneath of the nvme-fc target port (for example: a nvmetcli clear).
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

ea96d649

nvme-fc: fix for a minor typos · d4e4230c

Milan P. Gandhi authored Aug 10, 2018

Signed-off-by: Milan P. Gandhi <mgandhi@redhat.com>
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

d4e4230c

nvmet: remove redundant module prefix · d93cb392

Chaitanya Kulkarni authored Sep 10, 2018

This patch removes the redundant module prefix used in the pr_err() when
nvmet_get_smart_log_nsid() failed to find the namespace provided as a part
of smart-log command.
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

d93cb392

nvme: fix typo in nvme_identify_ns_descs · 53b3a661

Milan P. Gandhi authored Aug 09, 2018

Signed-off-by: Milan P. Gandhi <mgandhi@redhat.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

53b3a661

Merge tag 'v4.19-rc6' into for-4.20/block · c0aac682

Jens Axboe authored Oct 01, 2018

Merge -rc6 in, for two reasons:

1) Resolve a trivial conflict in the blk-mq-tag.c documentation
2) A few important regression fixes went into upstream directly, so
   they aren't in the 4.20 branch.
Signed-off-by: Jens Axboe <axboe@kernel.dk>

* tag 'v4.19-rc6': (780 commits)
  Linux 4.19-rc6
  MAINTAINERS: fix reference to moved drivers/{misc => auxdisplay}/panel.c
  cpufreq: qcom-kryo: Fix section annotations
  perf/core: Add sanity check to deal with pinned event failure
  xen/blkfront: correct purging of persistent grants
  Revert "xen/blkfront: When purging persistent grants, keep them in the buffer"
  selftests/powerpc: Fix Makefiles for headers_install change
  blk-mq: I/O and timer unplugs are inverted in blktrace
  dax: Fix deadlock in dax_lock_mapping_entry()
  x86/boot: Fix kexec booting failure in the SEV bit detection code
  bcache: add separate workqueue for journal_write to avoid deadlock
  drm/amd/display: Fix Edid emulation for linux
  drm/amd/display: Fix Vega10 lightup on S3 resume
  drm/amdgpu: Fix vce work queue was not cancelled when suspend
  Revert "drm/panel: Add device_link from panel device to DRM device"
  xen/blkfront: When purging persistent grants, keep them in the buffer
  clocksource/drivers/timer-atmel-pit: Properly handle error cases
  block: fix deadline elevator drain for zoned block devices
  ACPI / hotplug / PCI: Don't scan for non-hotplug bridges if slot is not bridge
  drm/syncobj: Don't leak fences when WAIT_FOR_SUBMIT is set
  ...
Signed-off-by: Jens Axboe <axboe@kernel.dk>

c0aac682

30 Sep, 2018 4 commits

Linux 4.19-rc6 · 17b57b18
Greg Kroah-Hartman authored Sep 30, 2018

17b57b18

Merge tag 'auxdisplay-for-greg-v4.19-rc6' of https://github.com/ojeda/linux · 9a10b063

Greg Kroah-Hartman authored Sep 30, 2018

Miguel writes:
  "A trivial fix for auxdisplay

    - MAINTAINERS reference fix for moved file
      Reported by Joe Perches"

* tag 'auxdisplay-for-greg-v4.19-rc6' of https://github.com/ojeda/linux:
  MAINTAINERS: fix reference to moved drivers/{misc => auxdisplay}/panel.c

9a10b063

Merge tag 'libnvdimm-fixes2-4.19-rc6' of... · 9ba6873e

Greg Kroah-Hartman authored Sep 30, 2018

Merge tag 'libnvdimm-fixes2-4.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

Dan writes:
  "filesystem-dax for 4.19-rc6

   Fix a deadlock in the new for 4.19 dax_lock_mapping_entry() routine."

* tag 'libnvdimm-fixes2-4.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  dax: Fix deadlock in dax_lock_mapping_entry()

9ba6873e

MAINTAINERS: fix reference to moved drivers/{misc => auxdisplay}/panel.c · 03d179a8

Miguel Ojeda authored Sep 30, 2018

Commit 51c1e9b5 ("auxdisplay: Move panel.c to drivers/auxdisplay folder")
moved the file, but the MAINTAINERS reference was not updated.

Link: https://lore.kernel.org/lkml/20180928220131.31075-1-joe@perches.com/Reported-by: Joe Perches <joe@perches.com>
Signed-off-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>

03d179a8

29 Sep, 2018 12 commits

Merge tag 'for-linus-20180929' of git://git.kernel.dk/linux-block · 291d0e5d

Greg Kroah-Hartman authored Sep 29, 2018

Jens writes:
  "Block fixes for 4.19-rc6

   A set of fixes that should go into this release. This pull request
   contains:

   - A fix (hopefully) for the persistent grants for xen-blkfront. A
     previous fix from this series wasn't complete, hence reverted, and
     this one should hopefully be it. (Boris Ostrovsky)

   - Fix for an elevator drain warning with SMR devices, which is
     triggered when you switch schedulers (Damien)

   - bcache deadlock fix (Guoju Fang)

   - Fix for the block unplug tracepoint, which has had the
     timer/explicit flag reverted since 4.11 (Ilya)

   - Fix a regression in this series where the blk-mq timeout hook is
     invoked with the RCU read lock held, hence preventing it from
     blocking (Keith)

   - NVMe pull from Christoph, with a single multipath fix (Susobhan Dey)"

* tag 'for-linus-20180929' of git://git.kernel.dk/linux-block:
  xen/blkfront: correct purging of persistent grants
  Revert "xen/blkfront: When purging persistent grants, keep them in the buffer"
  blk-mq: I/O and timer unplugs are inverted in blktrace
  bcache: add separate workqueue for journal_write to avoid deadlock
  xen/blkfront: When purging persistent grants, keep them in the buffer
  block: fix deadline elevator drain for zoned block devices
  blk-mq: Allow blocking queue tag iter callbacks
  nvme: properly propagate errors in nvme_mpath_init

291d0e5d

Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e7541773

Greg Kroah-Hartman authored Sep 29, 2018

Thomas writes:
  "A single fix for the AMD memory encryption boot code so it does not
   read random garbage instead of the cached encryption bit when a kexec
   kernel is allocated above the 32bit address limit."

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/boot: Fix kexec booting failure in the SEV bit detection code

e7541773

Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e1ce697d

Greg Kroah-Hartman authored Sep 29, 2018

Thomas writes:
  "Three small fixes for clocksource drivers:
   - Proper error handling in the Atmel PIT driver
   - Add CLOCK_SOURCE_SUSPEND_NONSTOP for TI SoCs so suspend works again
   - Fix the next event function for Facebook Backpack-CMM BMC chips so
     usleep(100) doesnt sleep several milliseconds"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  clocksource/drivers/timer-atmel-pit: Properly handle error cases
  clocksource/drivers/fttmr010: Fix set_next_event handler
  clocksource/drivers/ti-32k: Add CLOCK_SOURCE_SUSPEND_NONSTOP flag for non-am43 SoCs

e1ce697d

Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · af17b3aa

Greg Kroah-Hartman authored Sep 29, 2018

Thomas writes:
  "A single fix for a missing sanity check when a pinned event is tried
  to be read on the wrong CPU due to a legit event scheduling failure."

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/core: Add sanity check to deal with pinned event failure

af17b3aa

Merge tag 'pm-4.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 82ec752c

Greg Kroah-Hartman authored Sep 29, 2018

Rafael writes:
  "Power management fix for 4.19-rc6

   Fix incorrect __init and __exit annotations in the Qualcomm
   Kryo cpufreq driver (Nathan Chancellor)."

* tag 'pm-4.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpufreq: qcom-kryo: Fix section annotations

82ec752c

cpufreq: qcom-kryo: Fix section annotations · d51aea13

Nathan Chancellor authored Sep 19, 2018

There is currently a warning when building the Kryo cpufreq driver into
the kernel image:

WARNING: vmlinux.o(.text+0x8aa424): Section mismatch in reference from
the function qcom_cpufreq_kryo_probe() to the function
.init.text:qcom_cpufreq_kryo_get_msm_id()
The function qcom_cpufreq_kryo_probe() references
the function __init qcom_cpufreq_kryo_get_msm_id().
This is often because qcom_cpufreq_kryo_probe lacks a __init
annotation or the annotation of qcom_cpufreq_kryo_get_msm_id is wrong.

Remove the '__init' annotation from qcom_cpufreq_kryo_get_msm_id
so that there is no more mismatch warning.

Additionally, Nick noticed that the remove function was marked as
'__init' when it should really be marked as '__exit'.

Fixes: 46e2856b (cpufreq: Add Kryo CPU scaling driver)
Fixes: 5ad7346b (cpufreq: kryo: Add module remove and exit)
Reported-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Cc: 4.18+ <stable@vger.kernel.org> # 4.18+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

d51aea13

Merge tag 'dma-mapping-4.19-3' of git://git.infradead.org/users/hch/dma-mapping · 7a6878bb

Greg Kroah-Hartman authored Sep 29, 2018

Christoph writes:
  "dma mapping fix for 4.19-rc6

   fix a missing Kconfig symbol for commits introduced in 4.19-rc"

* tag 'dma-mapping-4.19-3' of git://git.infradead.org/users/hch/dma-mapping:
  dma-mapping: add the missing ARCH_HAS_SYNC_DMA_FOR_CPU_ALL declaration

7a6878bb

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input · e704966c

Greg Kroah-Hartman authored Sep 28, 2018

Dmitry writes:
  "Input updates for v4.19-rc5

   Just a few driver fixes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: uinput - allow for max == min during input_absinfo validation
  Input: elantech - enable middle button of touchpad on ThinkPad P72
  Input: atakbd - fix Atari CapsLock behaviour
  Input: atakbd - fix Atari keymap
  Input: egalax_ts - add system wakeup support
  Input: gpio-keys - fix a documentation index issue

e704966c

Merge tag 'spi-fix-v4.19-rc5' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi · 2f19e7a7

Greg Kroah-Hartman authored Sep 28, 2018

Mark writes:
  "spi: Fixes for v4.19

   Quite a few fixes for the Renesas drivers in here, plus a fix for the
   Tegra driver and some documentation fixes for the recently added
   spi-mem code.  The Tegra fix is relatively large but fairly
   straightforward and mechanical, it runs on probe so it's been
   reasonably well covered in -next testing."

* tag 'spi-fix-v4.19-rc5' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: spi-mem: Move the DMA-able constraint doc to the kerneldoc header
  spi: spi-mem: Add missing description for data.nbytes field
  spi: rspi: Fix interrupted DMA transfers
  spi: rspi: Fix invalid SPI use during system suspend
  spi: sh-msiof: Fix handling of write value for SISTR register
  spi: sh-msiof: Fix invalid SPI use during system suspend
  spi: gpio: Fix copy-and-paste error
  spi: tegra20-slink: explicitly enable/disable clock

2f19e7a7

Merge tag 'regulator-v4.19-rc5' of... · 8f056611

Greg Kroah-Hartman authored Sep 28, 2018

Merge tag 'regulator-v4.19-rc5' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator

Mark writes:
  "regulator: Fixes for 4.19

   A collection of fairly minor bug fixes here, a couple of driver
   specific ones plus two core fixes.  There's one fix for the new
   suspend state code which fixes some confusion with constant values
   that are supposed to indicate noop operation and another fixing a
   race condition with the creation of sysfs files on new regulators."

* tag 'regulator-v4.19-rc5' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: fix crash caused by null driver data
  regulator: Fix 'do-nothing' value for regulators without suspend state
  regulator: da9063: fix DT probing with constraints
  regulator: bd71837: Disable voltage monitoring for LDO3/4

8f056611

Merge tag 'powerpc-4.19-3' of https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · f005de01

Greg Kroah-Hartman authored Sep 28, 2018

Michael writes:
  "powerpc fixes for 4.19 #3

   A reasonably big batch of fixes due to me being away for a few weeks.

   A fix for the TM emulation support on Power9, which could result in
   corrupting the guest r11 when running under KVM.

   Two fixes to the TM code which could lead to userspace GPR corruption
   if we take an SLB miss at exactly the wrong time.

   Our dynamic patching code had a bug that meant we could patch freed
   __init text, which could lead to corrupting userspace memory.

   csum_ipv6_magic() didn't work on little endian platforms since we
   optimised it recently.

   A fix for an endian bug when reading a device tree property telling
   us how many storage keys the machine has available.

   Fix a crash seen on some configurations of PowerVM when migrating the
   partition from one machine to another.

   A fix for a regression in the setup of our CPU to NUMA node mapping
   in KVM guests.

   A fix to our selftest Makefiles to make them work since a recent
   change to the shared Makefile logic."

* tag 'powerpc-4.19-3' of https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  selftests/powerpc: Fix Makefiles for headers_install change
  powerpc/numa: Use associativity if VPHN hcall is successful
  powerpc/tm: Avoid possible userspace r1 corruption on reclaim
  powerpc/tm: Fix userspace r13 corruption
  powerpc/pseries: Fix unitialized timer reset on migration
  powerpc/pkeys: Fix reading of ibm, processor-storage-keys property
  powerpc: fix csum_ipv6_magic() on little endian platforms
  powerpc/powernv/ioda2: Reduce upper limit for DMA window size (again)
  powerpc: Avoid code patching freed init sections
  KVM: PPC: Book3S HV: Fix guest r11 corruption with POWER9 TM workarounds

f005de01

Merge tag 'pinctrl-v4.19-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl · 900915f9

Greg Kroah-Hartman authored Sep 28, 2018

Linus writes:
  "Pin control fixes for v4.19:
   - Fixes to x86 hardware:
   - AMD interrupt debounce issues
   - Faulty Intel cannonlake register offset
   - Revert pin translation IRQ locking"

* tag 'pinctrl-v4.19-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  Revert "pinctrl: intel: Do pin translation when lock IRQ"
  pinctrl: cannonlake: Fix HOSTSW_OWN register offset of H variant
  pinctrl/amd: poll InterruptEnable bits in amd_gpio_irq_set_type

900915f9

28 Sep, 2018 6 commits

perf/core: Add sanity check to deal with pinned event failure · befb1b3c

Reinette Chatre authored Sep 19, 2018

It is possible that a failure can occur during the scheduling of a
pinned event. The initial portion of perf_event_read_local() contains
the various error checks an event should pass before it can be
considered valid. Ensure that the potential scheduling failure
of a pinned event is checked for and have a credible error.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: fenghua.yu@intel.com
Cc: tony.luck@intel.com
Cc: acme@kernel.org
Cc: gavin.hindman@intel.com
Cc: jithu.joseph@intel.com
Cc: dave.hansen@intel.com
Cc: hpa@zytor.com
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/6486385d1f30336e9973b24c8c65f5079543d3d3.1537377064.git.reinette.chatre@intel.com

befb1b3c

blk-iolatency: keep track of previous windows stats · 451bb7c3

Josef Bacik authored Sep 28, 2018

We apply a smoothing to the scale changes in order to keep sawtoothy
behavior from occurring.  However our window for checking if we've
missed our target can sometimes be lower than the smoothing interval
(500ms), especially on faster drives like ssd's.  In order to deal with
this keep track of the running tally of the previous intervals that we
threw away because we had already done a scale event recently.

This is needed for the ssd case as these low latency drives will have
bursts of latency, and if it happens to be ok for the window that
directly follows the opening of the scale window we could unthrottle
when previous windows we were missing our target.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

451bb7c3

blk-iolatency: use a percentile approache for ssd's · 1fa2840e

Josef Bacik authored Sep 28, 2018

We use an average latency approach for determining if we're missing our
latency target. This works well for rotational storage where we have
generally consistent latencies, but for ssd's and other low latency
devices you have more of a spikey behavior, which means we often won't
throttle misbehaving groups because a lot of IO completes at drastically
faster times than our latency target. Instead keep track of how many
IO's miss our target and how many IO's are done in our time window. If
the p(90) latency is above our target then we know we need to throttle.
With this change in place we are seeing the same throttling behavior
with our testcase on ssd's as we see with rotational drives.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

1fa2840e

blk-iolatency: deal with small samples · 22ed8a93

Josef Bacik authored Sep 28, 2018

There is logic to keep cgroups that haven't done a lot of IO in the most
recent scale window from being punished for over-active higher priority
groups. However for things like ssd's where the windows are pretty
short we'll end up with small numbers of samples, so 5% of samples will
come out to 0 if there aren't enough. Make the floor 1 sample to keep
us from improperly bailing out of scaling down.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

22ed8a93

blk-iolatency: deal with nr_requests == 1 · 9f60511a

Josef Bacik authored Sep 28, 2018

Hitting the case where blk_queue_depth() returned 1 uncovered the fact
that iolatency doesn't actually handle this case properly, it simply
doesn't scale down anybody. For this case we should go straight into
applying the time delay, which we weren't doing. Since we already limit
the floor at 1 request this if statement is not needed, and this allows
us to set our depth to 1 which allows us to apply the delay if needed.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

9f60511a

blk-iolatency: use q->nr_requests directly · ff4cee08

Josef Bacik authored Sep 28, 2018

We were using blk_queue_depth() assuming that it would return
nr_requests, but we hit a case in production on drives that had to have
NCQ turned off in order for them to not shit the bed which resulted in a
qd of 1, even though the nr_requests was much larger. iolatency really
only cares about requests we are allowed to queue up, as any io that
get's onto the request list is going to be serviced soonish, so we want
to be throttling before the bio gets onto the request list. To make
iolatency work as expected, simply use q->nr_requests instead of
blk_queue_depth() as that is what we actually care about.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

ff4cee08