1. 30 Sep, 2019 3 commits
    • Merge branch 'entropy' · 3f2dc279
      Linus Torvalds authored
      Merge active entropy generation updates.
      
      This is admittedly partly "for discussion".  We need to have a way
      forward for the boot time deadlocks where user space ends up waiting for
      more entropy, but no entropy is forthcoming because the system is
      entirely idle just waiting for something to happen.
      
      While this was triggered by what is arguably a user space bug with
      GDM/gnome-session asking for secure randomness during early boot, when
      they didn't even need any such truly secure thing, the issue ends up
      being that our "getrandom()" interface is prone to that kind of
      confusion, because people don't think very hard about whether they want
      to block for sufficient amounts of entropy.
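
To make the blocking question concrete, here is a minimal user-space sketch of the non-blocking request an early-boot caller could make instead. It assumes glibc's getrandom() wrapper from <sys/random.h>; the helper name random_bytes_nonblocking is hypothetical and does not come from any commit listed here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/random.h>
#include <sys/types.h>

/*
 * Ask for random bytes without ever sleeping on the entropy pool.
 * Returns the number of bytes written, or -1 with errno set. errno is
 * EAGAIN when the kernel's generator has not been initialized yet.
 * (random_bytes_nonblocking is an illustrative name, not a real API.)
 */
ssize_t random_bytes_nonblocking(void *buf, size_t len)
{
    ssize_t n;

    assert(buf != NULL || len == 0);

    do {
        n = getrandom(buf, len, GRND_NONBLOCK);
    } while (n == -1 && errno == EINTR);   /* retry if interrupted by a signal */

    return n;
}
```

A caller that only needs a hash seed or similar can treat -1/EAGAIN as "fall back to a weaker source" rather than stalling boot.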
      
      The approach here is to decide to not just passively wait for entropy
      to happen, but to start actively collecting it if it is missing.  This
      is not necessarily always possible, but if the architecture has a CPU
      cycle counter, there is a fair amount of noise in the exact timings of
      reasonably complex loads.
      
      We may end up tweaking the load and the entropy estimates, but this
      should be at least a reasonable starting point.
      
      As part of this, we also revert the revert of the ext4 IO pattern
      improvement that ended up triggering the reported lack of external
      entropy.
      
      * getrandom() active entropy waiting:
        Revert "Revert "ext4: make __ext4_get_inode_loc plug""
        random: try to actively add entropy rather than passively wait for it
    • Revert "Revert "ext4: make __ext4_get_inode_loc plug"" · 02f03c42
      Linus Torvalds authored
      This reverts commit 72dbcf72.
      
      Instead of waiting forever for entropy that may just not happen, we now
      try to actively generate entropy when required, and are thus hopefully
      avoiding the problem that caused the nice ext4 IO pattern fix to be
      reverted.
      
      So revert the revert.
      
      Cc: Ahmed S. Darwish <darwish.07@gmail.com>
      Cc: Ted Ts'o <tytso@mit.edu>
      Cc: Willy Tarreau <w@1wt.eu>
      Cc: Alexander E. Patrakov <patrakov@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • random: try to actively add entropy rather than passively wait for it · 50ee7529
      Linus Torvalds authored
      For 5.3 we had to revert a nice ext4 IO pattern improvement, because it
      caused a bootup regression due to lack of entropy at bootup together
      with arguably broken user space that was asking for secure random
      numbers when it really didn't need to.
      
      See commit 72dbcf72 (Revert "ext4: make __ext4_get_inode_loc plug").
      
      This aims to solve the issue by actively generating entropy noise using
      the CPU cycle counter when waiting for the random number generator to
      initialize.  This only works when you have a high-frequency time stamp
      counter available, but that's the case on all modern x86 CPU's, and on
      most other modern CPU's too.
      
      What we do is to generate jitter entropy from the CPU cycle counter
      under a somewhat complex load: calling the scheduler while also
      guaranteeing a certain amount of timing noise by also triggering a
      timer.
      
      I'm sure we can tweak this, and that people will want to look at other
      alternatives, but there's been a number of papers written on jitter
      entropy, and this should really be fairly conservative by crediting one
      bit of entropy for every timer-induced jump in the cycle counter.  Not
      because the timer itself would be all that unpredictable, but because
      the interaction between the timer and the loop is going to be.
      
      Even if (and perhaps particularly if) the timer actually happens on
      another CPU, the cacheline interaction between the loop that reads the
      cycle counter and the timer itself firing is going to add perturbations
      to the cycle counter values that get mixed into the entropy pool.
      
      As Thomas pointed out, with a modern out-of-order CPU, even quite simple
      loops show a fair amount of hard-to-predict timing variability even in
      the absence of external interrupts.  But this tries to take that further
      by actually having a fairly complex interaction.
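
The loop described above can be approximated in user space. The sketch below is an illustration under loud assumptions, not the kernel's code: a monotonic-clock deadline stands in for the kernel timer, sched_yield() stands in for calling the scheduler, and the mixer is a toy rather than a cryptographic pool. All names (jitter_pool, collect_jitter, tick_ns) are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sched.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

struct jitter_pool {
    uint64_t state;      /* toy mixing state, NOT a cryptographic pool */
    unsigned credited;   /* bits of entropy we are willing to claim */
};

/* Read the monotonic clock in nanoseconds (stand-in for a cycle counter). */
static uint64_t now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Toy mixer: xor, multiply, rotate. A real pool uses a proper hash. */
static void mix(struct jitter_pool *p, uint64_t sample)
{
    p->state ^= sample;
    p->state *= 0x9e3779b97f4a7c15ull;
    p->state = (p->state << 29) | (p->state >> 35);
}

/* Mix every counter sample, but credit one bit only per elapsed tick. */
void collect_jitter(struct jitter_pool *p, unsigned want_bits, uint64_t tick_ns)
{
    uint64_t deadline;

    assert(p != NULL);
    deadline = now_ns() + tick_ns;

    while (p->credited < want_bits) {
        uint64_t t = now_ns();

        mix(p, t);               /* every sample goes into the state */
        if (t >= deadline) {     /* the simulated timer has "fired" */
            p->credited++;       /* conservative: one bit per tick */
            deadline = t + tick_ns;
        }
        sched_yield();           /* stand-in for calling the scheduler */
    }
}
```

The point of the design is that many samples are mixed per credited bit, so the credit stays conservative even if individual samples are fairly predictable.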
      
      This is not going to solve the entropy issue for architectures that have
      no CPU cycle counter, but it's not clear how (and if) that is solvable,
      and the hardware in question is largely starting to be irrelevant.  And
      by doing this we can at least avoid some of the even more contentious
      approaches (like making the entropy waiting time out in order to avoid
      the possibly unbounded waiting).
      
      Cc: Ahmed Darwish <darwish.07@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: Nicholas Mc Guire <hofrat@opentech.at>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Willy Tarreau <w@1wt.eu>
      Cc: Alexander E. Patrakov <patrakov@gmail.com>
      Cc: Lennart Poettering <mzxreary@0pointer.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 29 Sep, 2019 5 commits
    • Merge tag 'libnvdimm-fixes-5.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm · a3c0e7b1
      Linus Torvalds authored
      More libnvdimm updates from Dan Williams:
      
       - Complete the reworks to interoperate with powerpc dynamic huge page
         sizes
      
       - Fix a crash due to missed accounting for the powerpc 'struct
         page'-memmap mapping granularity
      
       - Fix badblock initialization for volatile (DRAM emulated) pmem ranges
      
       - Stop triggering request_key() notifications to userspace when
         NVDIMM-security is disabled / not present
      
       - Miscellaneous small fixups
      
      * tag 'libnvdimm-fixes-5.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
        libnvdimm/region: Enable MAP_SYNC for volatile regions
        libnvdimm: prevent nvdimm from requesting key when security is disabled
        libnvdimm/region: Initialize bad block for volatile namespaces
        libnvdimm/nfit_test: Fix acpi_handle redefinition
        libnvdimm/altmap: Track namespace boundaries in altmap
        libnvdimm: Fix endian conversion issues 
        libnvdimm/dax: Pick the right alignment default when creating dax devices
        powerpc/book3s64: Export has_transparent_hugepage() related functions.
    • Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal · 939ca9f1
      Linus Torvalds authored
      Pull thermal SoC updates from Eduardo Valentin:
       "This is a really small pull in the midst of a lot of pending patches.
      
        We are in the middle of restructuring how we are maintaining the
        thermal subsystem, as per discussion in our last LPC. For now, I am
        sending just some changes that were pending in my tree. Looking
        forward to get a more streamlined process in the next merge window"
      
      * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal:
        thermal: db8500: Rewrite to be a pure OF sensor
        thermal: db8500: Use dev helper variable
        thermal: db8500: Finalize device tree conversion
        thermal: thermal_mmio: remove some dead code
    • Merge branch 'i2c/for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · 9ecb3e10
      Linus Torvalds authored
      Pull more i2c updates from Wolfram Sang:
      
       - make Lenovo Yoga C630 boot now that the dependencies are merged
      
       - restore BlockProcessCall for i801, accidentally removed in this merge
         window
      
       - a bugfix for the riic driver
      
       - an improvement to the slave-eeprom driver which should have been in
         the first pull request but sadly got lost in the process
      
      * 'i2c/for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        i2c: slave-eeprom: Add read only mode
        i2c: i801: Bring back Block Process Call support for certain platforms
        i2c: riic: Clear NACK in tend isr
        i2c: qcom-geni: Disable DMA processing on the Lenovo Yoga C630
    • Merge tag 'iommu-fixes-5.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu · 4d2af08e
      Linus Torvalds authored
      Pull iommu fixes from Joerg Roedel:
       "A couple of fixes for the AMD IOMMU driver have piled up:
      
         - Some fixes for the reworked IO page-table which caused memory leaks
            or did not allow downgrading mappings under some conditions.
      
         - Locking fixes to fix a couple of possible races around accessing
           'struct protection_domain'. The races got introduced when the
           dma-ops path became lock-less in the fast-path"
      
      * tag 'iommu-fixes-5.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
        iommu/amd: Lock code paths traversing protection_domain->dev_list
        iommu/amd: Lock dev_data in attach/detach code paths
        iommu/amd: Check for busy devices earlier in attach_device()
        iommu/amd: Take domain->lock for complete attach/detach path
        iommu/amd: Remove amd_iommu_devtable_lock
        iommu/amd: Remove domain->updated
        iommu/amd: Wait for completion of IOTLB flush in attach_device
        iommu/amd: Unmap all L7 PTEs when downgrading page-sizes
        iommu/amd: Introduce first_pte_l7() helper
        iommu/amd: Fix downgrading default page-sizes in alloc_pte()
        iommu/amd: Fix pages leak in free_pagetable()
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 02dc96ef
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Sanity check URB networking device parameters to avoid divide by
          zero, from Oliver Neukum.
      
       2) Disable global multicast filter in NCSI, otherwise LLDP and IPV6
          don't work properly. Longer term this needs a better fix tho. From
          Vijay Khemka.
      
       3) Small fixes to selftests (use ping when ping6 is not present, etc.)
          from David Ahern.
      
        4) Bring back rt_uses_gateway member of struct rtable, its semantics
          were not well understood and trying to remove it broke things. From
          David Ahern.
      
        5) Move usbnet sanity checking, ignore endpoints with invalid
          wMaxPacketSize. From Bjørn Mork.
      
       6) Missing Kconfig deps for sja1105 driver, from Mao Wenan.
      
       7) Various small fixes to the mlx5 DR steering code, from Alaa Hleihel,
          Alex Vesker, and Yevgeny Kliteynik
      
       8) Missing CAP_NET_RAW checks in various places, from Ori Nimron.
      
       9) Fix crash when removing sch_cbs entry while offloading is enabled,
          from Vinicius Costa Gomes.
      
      10) Signedness bug fixes, generally in looking at the result given by
          of_get_phy_mode() and friends. From Dan Carpenter.
      
      11) Disable preemption around BPF_PROG_RUN() calls, from Eric Dumazet.
      
      12) Don't create VRF ipv6 rules if ipv6 is disabled, from David Ahern.
      
      13) Fix quantization code in tcp_bbr, from Kevin Yang.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (127 commits)
        net: tap: clean up an indentation issue
        nfp: abm: fix memory leak in nfp_abm_u32_knode_replace
        tcp: better handle TCP_USER_TIMEOUT in SYN_SENT state
        sk_buff: drop all skb extensions on free and skb scrubbing
        tcp_bbr: fix quantization code to not raise cwnd if not probing bandwidth
        mlxsw: spectrum_flower: Fail in case user specifies multiple mirror actions
        Documentation: Clarify trap's description
        mlxsw: spectrum: Clear VLAN filters during port initialization
        net: ena: clean up indentation issue
        NFC: st95hf: clean up indentation issue
        net: phy: micrel: add Asym Pause workaround for KSZ9021
        net: socionext: ave: Avoid using netdev_err() before calling register_netdev()
        ptp: correctly disable flags on old ioctls
        lib: dimlib: fix help text typos
        net: dsa: microchip: Always set regmap stride to 1
        nfp: flower: fix memory leak in nfp_flower_spawn_vnic_reprs
        nfp: flower: prevent memory leak in nfp_flower_spawn_phy_reprs
        net/sched: Set default of CONFIG_NET_TC_SKB_EXT to N
        vrf: Do not attempt to create IPv6 mcast rule if IPv6 is disabled
        net: sched: sch_sfb: don't call qdisc_put() while holding tree lock
        ...
  3. 28 Sep, 2019 21 commits
    • Merge branch 'hugepage-fallbacks' (hugepatch patches from David Rientjes) · edf445ad
      Linus Torvalds authored
      Merge hugepage allocation updates from David Rientjes:
       "We (mostly Linus, Andrea, and myself) have been discussing offlist how
        to implement a sane default allocation strategy for hugepages on NUMA
        platforms.
      
        With these reverts in place, the page allocator will happily allocate
        a remote hugepage immediately rather than try to make a local hugepage
        available. This incurs a substantial performance degradation when
        memory compaction would have otherwise made a local hugepage
        available.
      
        This series reverts those reverts and attempts to propose a more sane
        default allocation strategy specifically for hugepages. Andrea
        acknowledges this is likely to fix the swap storms that he originally
        reported that resulted in the patches that removed __GFP_THISNODE from
        hugepage allocations.
      
        The immediate goal is to return 5.3 to the behavior the kernel has
        implemented over the past several years so that remote hugepages are
        not immediately allocated when local hugepages could have been made
        available because the increased access latency is untenable.
      
        The next goal is to introduce a sane default allocation strategy for
        hugepages allocations in general regardless of the configuration of
        the system so that we prevent thrashing of local memory when
        compaction is unlikely to succeed and can prefer remote hugepages over
        remote native pages when the local node is low on memory."
      
      Note on timing: this reverts the hugepage VM behavior changes that got
      introduced fairly late in the 5.3 cycle, and that fixed a huge
      performance regression for certain loads that had been around since
      4.18.
      
      Andrea had this note:
      
       "The regression of 4.18 was that it was taking hours to start a VM
        where 3.10 was only taking a few seconds, I reported all the details
        on lkml when it was finally tracked down in August 2018.
      
           https://lore.kernel.org/linux-mm/20180820032640.9896-2-aarcange@redhat.com/
      
        __GFP_THISNODE in MADV_HUGEPAGE made the above enterprise vfio
        workload degrade like in the "current upstream" above. And it still
        would have been that bad as above until 5.3-rc5"
      
      where the bad behavior ends up happening as you fill up a local node,
      and without that change, you'd get into the nasty swap storm behavior
      due to compaction working overtime to make room for more memory on the
      nodes.
      
      As a result 5.3 got the two performance fix reverts in rc5.
      
      However, David Rientjes then noted that those performance fixes in turn
      regressed performance for other loads - although not quite to the same
      degree.  He suggested reverting the reverts and instead replacing them
      with two small changes to how hugepage allocations are done (patch
      descriptions rephrased by me):
      
       - "avoid expensive reclaim when compaction may not succeed": just admit
         that the allocation failed when you're trying to allocate a huge-page
         and compaction wasn't successful.
      
       - "allow hugepage fallback to remote nodes when madvised": when that
         node-local huge-page allocation failed, retry without forcing the
         local node.
      
      but by then I judged it too late to replace the fixes for a 5.3 release.
      So 5.3 was released with behavior that harked back to the pre-4.18 logic.
      
      But now we're in the merge window for 5.4, and we can see if this
      alternate model fixes not just the horrendous swap storm behavior, but
      also fixes the performance regression that the late reverts caused.
      
      Fingers crossed.
      
      * emailed patches from David Rientjes <rientjes@google.com>:
        mm, page_alloc: allow hugepage fallback to remote nodes when madvised
        mm, page_alloc: avoid expensive reclaim when compaction may not succeed
        Revert "Revert "Revert "mm, thp: consolidate THP gfp handling into alloc_hugepage_direct_gfpmask""
        Revert "Revert "mm, thp: restore node-local hugepage allocations""
    • mm, page_alloc: allow hugepage fallback to remote nodes when madvised · 76e654cc
      David Rientjes authored
      For systems configured to always try hard to allocate transparent
      hugepages (thp defrag setting of "always") or for memory that has been
      explicitly madvised to MADV_HUGEPAGE, it is often better to fall back to
      remote memory to allocate the hugepage if the local allocation fails
      first.
      
      The point is to allow the initial call to __alloc_pages_node() to attempt
      to defragment local memory to make a hugepage available, if possible,
      rather than immediately fall back to remote memory.  Local hugepages will
      always have a better access latency than remote (huge)pages, so an attempt
      to make a hugepage available locally is always preferred.
      
      If memory compaction cannot be successful locally, however, it is likely
      better to fall back to remote memory.  This could take on two forms: either
      allow immediate fallback to remote memory or do per-zone watermark checks.
      It would be possible to fall back only when per-zone watermarks fail for
      order-0 memory, since that would require local reclaim for all subsequent
      faults so remote huge allocation is likely better than thrashing the local
      zone for large workloads.
      
      In this case, it is assumed that because the system is configured to try
      hard to allocate hugepages or the vma is advised to explicitly want to try
      hard for hugepages that remote allocation is better when local allocation
      and memory compaction have both failed.
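
The ordering this message describes (local attempt with compaction first, remote only afterwards and only for "try hard" requests) can be modelled as a small decision function. This is a simplified sketch, not the patch itself: the struct, its fields, and thp_pick_node are hypothetical names, and the booleans collapse what the real allocator discovers step by step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum node_choice { NODE_NONE, NODE_LOCAL, NODE_REMOTE };

/* Hypothetical inputs describing one hugepage fault. */
struct thp_request {
    bool local_free;          /* local node already has a free hugepage */
    bool local_compactable;   /* compaction could produce one locally */
    bool remote_free;         /* some remote node has a free hugepage */
    bool try_hard;            /* defrag "always" or MADV_HUGEPAGE'd vma */
};

enum node_choice thp_pick_node(const struct thp_request *r)
{
    assert(r != NULL);

    /* First attempt: restricted to the local node, compaction allowed. */
    if (r->local_free || r->local_compactable)
        return NODE_LOCAL;

    /* Local allocation and compaction both failed: retry unrestricted. */
    if (r->try_hard && r->remote_free)
        return NODE_REMOTE;

    return NODE_NONE;   /* in this model, no hugepage is allocated */
}
```

Note that a remote hugepage is never chosen while a local one could still be made available, which is the property the message argues for.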
      
      Signed-off-by: David Rientjes <rientjes@google.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Stefan Priebe - Profihost AG <s.priebe@profihost.ag>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm, page_alloc: avoid expensive reclaim when compaction may not succeed · b39d0ee2
      David Rientjes authored
      Memory compaction has a couple significant drawbacks as the allocation
      order increases, specifically:
      
       - isolate_freepages() is responsible for finding free pages to use as
         migration targets and is implemented as a linear scan of memory
         starting at the end of a zone,
      
       - failing order-0 watermark checks in memory compaction does not account
         for how far below the watermarks the zone actually is: to enable
         migration, there must be *some* free memory available.  Per the above,
         watermarks are not always sufficient if isolate_freepages() cannot
         find the free memory but it could require hundreds of MBs of reclaim to
         even reach this threshold (read: potentially very expensive reclaim with
         no indication compaction can be successful), and
      
       - if compaction at this order has failed recently so that it does not even
         run as a result of deferred compaction, looping through reclaim can often
         be pointless.
      
      For hugepage allocations, these are quite substantial drawbacks because
      these are very high order allocations (order-9 on x86) and falling back to
      doing reclaim can potentially be *very* expensive without any indication
      that compaction would even be successful.
      
      Reclaim itself is unlikely to free entire pageblocks and certainly no
      reliance should be put on it to do so in isolation (recall lumpy reclaim).
      This means we should avoid reclaim and simply fail hugepage allocation if
      compaction is deferred.
      
      It is also not helpful to thrash a zone by doing excessive reclaim if
      compaction may not be able to access that memory.  If order-0 watermarks
      fail and the allocation order is sufficiently large, it is likely better
      to fail the allocation rather than thrashing the zone.
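
The two conclusions above (fail when compaction is deferred, and fail when order-0 watermarks are too low for compaction to run on a sufficiently large order) can be written as one predicate. This is a hedged model only: the enum labels and reclaim_worth_trying are hypothetical names, and the real slow path checks more conditions than are shown here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome labels; the kernel's own enum differs. */
enum compact_outcome {
    COMPACT_RAN_AND_FAILED,   /* compaction ran but produced no hugepage */
    COMPACT_WAS_SKIPPED,      /* order-0 watermarks too low to even start */
    COMPACT_WAS_DEFERRED      /* failed recently at this order, so not run */
};

/*
 * Decide whether direct reclaim is worth attempting after a failed
 * allocation of 2^order pages. pageblock_order is the order of one
 * pageblock (hugepage allocations are order-9 on x86, per the text).
 */
bool reclaim_worth_trying(unsigned order, unsigned pageblock_order,
                          enum compact_outcome outcome)
{
    assert(order <= 63);         /* sanity bound for this model only */

    if (order < pageblock_order)
        return true;             /* smaller requests: reclaim as usual */

    /* Reclaim alone is unlikely to free entire pageblocks. */
    if (outcome == COMPACT_WAS_DEFERRED || outcome == COMPACT_WAS_SKIPPED)
        return false;            /* fail fast instead of thrashing the zone */

    return true;
}
```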
      
      Signed-off-by: David Rientjes <rientjes@google.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Stefan Priebe - Profihost AG <s.priebe@profihost.ag>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Revert "Revert "Revert "mm, thp: consolidate THP gfp handling into alloc_hugepage_direct_gfpmask"" · 19deb769
      David Rientjes authored
      This reverts commit 92717d42.
      
      Since commit a8282608 ("Revert "mm, thp: restore node-local hugepage
      allocations"") is reverted in this series, it is better to restore the
      previous 5.2 behavior between the thp allocation and the page allocator
      rather than to attempt any consolidation or cleanup for a policy that is
      now reverted.  It's less risky during an rc cycle and subsequent patches
      in this series further modify the same policy that the pre-5.3 behavior
      implements.
      
      Consolidation and cleanup can be done subsequent to a sane default page
      allocation strategy, so this patch reverts a cleanup done on a strategy
      that is now reverted and thus is the least risky option.
      
      Signed-off-by: David Rientjes <rientjes@google.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Stefan Priebe - Profihost AG <s.priebe@profihost.ag>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Revert "Revert "mm, thp: restore node-local hugepage allocations"" · ac79f78d
      David Rientjes authored
      This reverts commit a8282608.
      
      The commit references the original intended semantic for MADV_HUGEPAGE
      which has subsequently taken on three unique purposes:
      
       - enables or disables thp for a range of memory depending on the system's
         config (is thp "enabled" set to "always" or "madvise"),
      
       - determines the synchronous compaction behavior for thp allocations at
         fault (is thp "defrag" set to "always", "defer+madvise", or "madvise"),
         and
      
       - reverts a previous MADV_NOHUGEPAGE (there is no madvise mode to only
         clear previous hugepage advice).
      
      These are the three purposes that currently exist in 5.2 and over the
      past several years that userspace has been written around.  Adding a
      NUMA locality preference adds a fourth dimension to an already conflated
      advice mode.
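
For readers unfamiliar with the interface, this is roughly how user space issues the advice discussed above. It is a minimal sketch assuming Linux with glibc; map_with_thp_hint is a hypothetical helper name. The hint is best-effort: on a kernel built without transparent hugepage support madvise() fails with EINVAL, and the mapping remains usable with native pages.

```c
#define _DEFAULT_SOURCE 1
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Map len bytes of anonymous memory and advise the kernel that
 * hugepages are wanted for the range. Returns NULL if the mapping
 * itself fails; *advised reports whether the hint was accepted.
 */
void *map_with_thp_hint(size_t len, bool *advised)
{
    void *p;

    assert(advised != NULL);

    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    *advised = (madvise(p, len, MADV_HUGEPAGE) == 0);
    return p;
}
```

The caller releases the range with munmap(p, len) as for any other mapping.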
      
      Based on the semantic that MADV_HUGEPAGE has provided over the past
      several years, there exist workloads that use the tunable based on these
      principles: specifically that the allocation should attempt to
      defragment a local node before falling back.  It is agreed that remote
      hugepages typically (but not always) have a better access latency than
      remote native pages, although on Naples this is at parity for
      intersocket.
      
      The revert commit that this patch reverts allows hugepage allocation to
      immediately allocate remotely when local memory is fragmented.  This is
      contrary to the semantic of MADV_HUGEPAGE over the past several years:
      that is, memory compaction should be attempted locally before falling
      back.
      
      The performance degradation of remote hugepages over local hugepages on
      Rome, for example, is 53.5% increased access latency.  For this reason,
      the goal is to revert back to the 5.2 and previous behavior that would
      attempt local defragmentation before falling back.  With the patch that
      is reverted by this patch, we see performance degradations at the tail
      because the allocator happily allocates the remote hugepage rather than
      even attempting to make a local hugepage available.
      
      zone_reclaim_mode is not a solution to this problem since it does not
      only impact hugepage allocations but rather changes the memory
      allocation strategy for *all* page allocations.
      
      Signed-off-by: David Rientjes <rientjes@google.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Stefan Priebe - Profihost AG <s.priebe@profihost.ag>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge tag 'powerpc-5.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · a2953204
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
       "An assortment of fixes that were either missed by me, or didn't arrive
        quite in time for the first v5.4 pull.
      
         - Most notable is a fix for an issue with tlbie (broadcast TLB
           invalidation) on Power9, when using the Radix MMU. The tlbie can
           race with an mtpid (move to PID register, essentially MMU context
           switch) on another thread of the core, which can cause stores to
           continue to go to a page after it's unmapped.
      
         - A fix in our KVM code to add a missing barrier, the lack of which
           has been observed to cause missed IPIs and subsequently stuck CPUs
           in the host.
      
         - A change to the way we initialise PCR (Processor Compatibility
           Register) to make it forward compatible with future CPUs.
      
         - On some older PowerVM systems our H_BLOCK_REMOVE support could
           oops, fix it to detect such systems and fall back to the old
           invalidation method.
      
         - A fix for an oops seen on some machines when using KASAN on 32-bit.
      
         - A handful of other minor fixes, and two new selftests.
      
        Thanks to: Alistair Popple, Aneesh Kumar K.V, Christophe Leroy,
        Gustavo Romero, Joel Stanley, Jordan Niethe, Laurent Dufour, Michael
        Roth, Oliver O'Halloran"
      
      * tag 'powerpc-5.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/eeh: Fix eeh eeh_debugfs_break_device() with SRIOV devices
        powerpc/nvdimm: use H_SCM_QUERY hcall on H_OVERLAP error
        powerpc/nvdimm: Use HCALL error as the return value
        selftests/powerpc: Add test case for tlbie vs mtpidr ordering issue
        powerpc/mm: Fixup tlbie vs mtpidr/mtlpidr ordering issue on POWER9
        powerpc/book3s64/radix: Rename CPU_FTR_P9_TLBIE_BUG feature flag
        powerpc/book3s64/mm: Don't do tlbie fixup for some hardware revisions
        powerpc/pseries: Call H_BLOCK_REMOVE when supported
        powerpc/pseries: Read TLB Block Invalidate Characteristics
        KVM: PPC: Book3S HV: use smp_mb() when setting/clearing host_ipi flag
        powerpc/mm: Fix an Oops in kasan_mmu_init()
        powerpc/mm: Add a helper to select PAGE_KERNEL_RO or PAGE_READONLY
        powerpc/64s: Set reserved PCR bits
        powerpc: Fix definition of PCR bits to work with old binutils
        powerpc/book3s64/radix: Remove WARN_ON in destroy_context()
        powerpc/tm: Add tm-poison test
    • Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · f19e00ee
      Linus Torvalds authored
      Pull x86 fix from Ingo Molnar:
       "A kexec fix for the case when GCC_PLUGIN_STACKLEAK=y is enabled"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/purgatory: Disable the stackleak GCC plugin for the purgatory
    • Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 9c5efe9a
      Linus Torvalds authored
      Pull scheduler fixes from Ingo Molnar:
      
       - Apply a number of membarrier related fixes and cleanups, which fixes
         a use-after-free race in the membarrier code
      
       - Introduce proper RCU protection for tasks on the runqueue - to get
         rid of the subtle task_rcu_dereference() interface that was easy to
         get wrong
      
       - Misc fixes, but also an EAS speedup
      
      * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched/fair: Avoid redundant EAS calculation
        sched/core: Remove double update_max_interval() call on CPU startup
        sched/core: Fix preempt_schedule() interrupt return comment
        sched/fair: Fix -Wunused-but-set-variable warnings
        sched/core: Fix migration to invalid CPU in __set_cpus_allowed_ptr()
        sched/membarrier: Return -ENOMEM to userspace on memory allocation failure
        sched/membarrier: Skip IPIs when mm->mm_users == 1
        selftests, sched/membarrier: Add multi-threaded test
        sched/membarrier: Fix p->mm->membarrier_state racy load
        sched/membarrier: Call sync_core only before usermode for same mm
        sched/membarrier: Remove redundant check
        sched/membarrier: Fix private expedited registration check
        tasks, sched/core: RCUify the assignment of rq->curr
        tasks, sched/core: With a grace period after finish_task_switch(), remove unnecessary code
        tasks, sched/core: Ensure tasks are available for a grace period after leaving the runqueue
        tasks: Add a count of task RCU users
        sched/core: Convert vcpu_is_preempted() from macro to an inline function
        sched/fair: Remove unused cfs_rq_clock_task() function
      9c5efe9a
    • Björn Ardö's avatar
      i2c: slave-eeprom: Add read only mode · 11af27f4
      Björn Ardö authored
      Add read-only versions of all EEPROMs. These versions are read-only
      on the i2c side, but can be written from the sysfs side.
      Signed-off-by: Björn Ardö <bjorn.ardo@axis.com>
      Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
      11af27f4
    • Jarkko Nikula's avatar
      i2c: i801: Bring back Block Process Call support for certain platforms · fd4b204a
      Jarkko Nikula authored
      Commit b84398d6 ("i2c: i801: Use iTCO version 6 in Cannon Lake PCH
      and beyond") appears to have accidentally dropped Block Write-Block Read
      Process Call support for Intel Sunrisepoint, Lewisburg, Denverton and
      Kaby Lake.
      
      That support was added for these and newer platforms by commit
      315cd67c ("i2c: i801: Add Block Write-Block Read Process Call
      support"), so bring it back for the platforms listed above.
      
      Fixes: b84398d6 ("i2c: i801: Use iTCO version 6 in Cannon Lake PCH and beyond")
      Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
      Reviewed-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>
      Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
      fd4b204a
    • Chris Brandt's avatar
      i2c: riic: Clear NACK in tend isr · a71e2ac1
      Chris Brandt authored
      The NACKF flag should be cleared in INTRIICNAKI interrupt processing, as
      described in the HW manual.
      
      This issue shows up quickly when PREEMPT_RT is applied and a device is
      probed that is not plugged in (like a touchscreen controller). The result
      is endless interrupts that halt system boot.
      
      Fixes: 310c18a4 ("i2c: riic: add driver")
      Cc: stable@vger.kernel.org
      Reported-by: Chien Nguyen <chien.nguyen.eb@rvc.renesas.com>
      Signed-off-by: Chris Brandt <chris.brandt@renesas.com>
      Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
      a71e2ac1
    • Lee Jones's avatar
      i2c: qcom-geni: Disable DMA processing on the Lenovo Yoga C630 · 127068ab
      Lee Jones authored
      We have a production-level laptop (Lenovo Yoga C630) which is exhibiting
      a rather horrific bug.  When I2C HID devices are being scanned for at
      boot-time the QCom Geni based I2C (Serial Engine) attempts to use DMA.
      When it does, the laptop reboots and the user never sees the OS.
      
      Attempts are being made to debug the reason for the spontaneous reboot.
      No luck so far, hence the requirement for this hot-fix.  This workaround
      will be removed once we have a viable fix.
      Signed-off-by: Lee Jones <lee.jones@linaro.org>
      Tested-by: Bjorn Andersson <bjorn.andersson@linaro.org>
      Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
      127068ab
    • Linus Torvalds's avatar
      Merge branch 'next-lockdown' of... · aefcf2f4
      Linus Torvalds authored
      Merge branch 'next-lockdown' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
      
      Pull kernel lockdown mode from James Morris:
       "This is the latest iteration of the kernel lockdown patchset, from
        Matthew Garrett, David Howells and others.
      
        From the original description:
      
          This patchset introduces an optional kernel lockdown feature,
          intended to strengthen the boundary between UID 0 and the kernel.
          When enabled, various pieces of kernel functionality are restricted.
          Applications that rely on low-level access to either hardware or the
          kernel may cease working as a result - therefore this should not be
          enabled without appropriate evaluation beforehand.
      
          The majority of mainstream distributions have been carrying variants
          of this patchset for many years now, so there's value in providing an
          upstream version. It doesn't meet every distribution requirement, but
          gets us much closer to not requiring external patches.
      
        There are two major changes since this was last proposed for mainline:
      
         - Separating lockdown from EFI secure boot. Background discussion is
           covered here: https://lwn.net/Articles/751061/
      
         -  Implementation as an LSM, with a default stackable lockdown LSM
            module. This allows the lockdown feature to be policy-driven,
            rather than encoding an implicit policy within the mechanism.
      
        The new locked_down LSM hook is provided to allow LSMs to make a
        policy decision around whether kernel functionality that would allow
        tampering with or examining the runtime state of the kernel should be
        permitted.
      
        The included lockdown LSM provides an implementation with a simple
        policy intended for general purpose use. This policy provides a coarse
        level of granularity, controllable via the kernel command line:
      
          lockdown={integrity|confidentiality}
      
        Enable the kernel lockdown feature. If set to integrity, kernel features
        that allow userland to modify the running kernel are disabled. If set to
        confidentiality, kernel features that allow userland to extract
        confidential information from the kernel are also disabled.
      
        This may also be controlled via /sys/kernel/security/lockdown and
        overridden by kernel configuration.
      
        New or existing LSMs may implement finer-grained controls of the
        lockdown features. Refer to the lockdown_reason documentation in
        include/linux/security.h for details.
      
        The lockdown feature has had significant design feedback and review
        across many subsystems. This code has been in linux-next for some
        weeks, with a few fixes applied along the way.
      
        Stephen Rothwell noted that commit 9d1f8be5 ("bpf: Restrict bpf
        when kernel lockdown is in confidentiality mode") is missing a
        Signed-off-by from its author. Matthew responded that he is providing
        this under category (c) of the DCO"
      
      * 'next-lockdown' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (31 commits)
        kexec: Fix file verification on S390
        security: constify some arrays in lockdown LSM
        lockdown: Print current->comm in restriction messages
        efi: Restrict efivar_ssdt_load when the kernel is locked down
        tracefs: Restrict tracefs when the kernel is locked down
        debugfs: Restrict debugfs when the kernel is locked down
        kexec: Allow kexec_file() with appropriate IMA policy when locked down
        lockdown: Lock down perf when in confidentiality mode
        bpf: Restrict bpf when kernel lockdown is in confidentiality mode
        lockdown: Lock down tracing and perf kprobes when in confidentiality mode
        lockdown: Lock down /proc/kcore
        x86/mmiotrace: Lock down the testmmiotrace module
        lockdown: Lock down module params that specify hardware parameters (eg. ioport)
        lockdown: Lock down TIOCSSERIAL
        lockdown: Prohibit PCMCIA CIS storage when the kernel is locked down
        acpi: Disable ACPI table override if the kernel is locked down
        acpi: Ignore acpi_rsdp kernel param when the kernel has been locked down
        ACPI: Limit access to custom_method when the kernel is locked down
        x86/msr: Restrict MSR access when the kernel is locked down
        x86: Lock down IO port access when the kernel is locked down
        ...
      aefcf2f4
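
The pull message above says the lockdown mode can also be controlled via /sys/kernel/security/lockdown. As a minimal sketch, assuming that file lists the available modes on one line with the active one in square brackets (for example "none [integrity] confidentiality"), the helper below extracts the active mode from such a line. The function name `lockdown_active_mode` is hypothetical and not part of any kernel or libc API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy the bracketed (active) mode out of a line such as
 * "none [integrity] confidentiality" into out.
 * Returns 0 on success, -1 if there is no bracketed token or it
 * does not fit into out. Hypothetical helper for illustration only.
 */
int lockdown_active_mode(const char *line, char *out, size_t outlen)
{
	const char *open, *close;
	size_t len;

	assert(line != NULL && out != NULL);

	open = strchr(line, '[');
	if (!open)
		return -1;
	close = strchr(open + 1, ']');
	if (!close)
		return -1;

	/* Length of the text strictly between '[' and ']'. */
	len = (size_t)(close - open - 1);
	if (len == 0 || len >= outlen)
		return -1;

	memcpy(out, open + 1, len);
	out[len] = '\0';
	return 0;
}
```

A caller would read one line from the sysfs file (with fgets(), for instance) and pass it to the helper. Working on a string rather than on the file keeps the parsing testable on machines where the lockdown LSM is not enabled.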
    • Joerg Roedel's avatar
      iommu/amd: Lock code paths traversing protection_domain->dev_list · 2a78f996
      Joerg Roedel authored
      The traversing of this list requires protection_domain->lock to be taken
      to avoid nasty races with attach/detach code. Make sure the lock is held
      on all code-paths traversing this list.
      Reported-by: Filippo Sironi <sironi@amazon.de>
      Fixes: 92d420ec ("iommu/amd: Relax locking in dma_ops path")
      Reviewed-by: Filippo Sironi <sironi@amazon.de>
      Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      2a78f996
    • Joerg Roedel's avatar
      iommu/amd: Lock dev_data in attach/detach code paths · ab7b2577
      Joerg Roedel authored
      Make sure that attaching and detaching a device can't race against each
      other and protect the iommu_dev_data with a spin_lock in these code
      paths.
      
      Fixes: 92d420ec ("iommu/amd: Relax locking in dma_ops path")
      Reviewed-by: Filippo Sironi <sironi@amazon.de>
      Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      ab7b2577
    • Joerg Roedel's avatar
      iommu/amd: Check for busy devices earlier in attach_device() · 45e528d9
      Joerg Roedel authored
      Check early in attach_device whether the device is already attached to a
      domain. This also simplifies the code path so that __attach_device() can
      be removed.
      
      Fixes: 92d420ec ("iommu/amd: Relax locking in dma_ops path")
      Reviewed-by: Filippo Sironi <sironi@amazon.de>
      Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      45e528d9
    • Joerg Roedel's avatar
      iommu/amd: Take domain->lock for complete attach/detach path · f6c0bfce
      Joerg Roedel authored
      The code-paths before __attach_device() and __detach_device() are called
      also access and modify domain state, so take the domain lock there too.
      This makes it possible to get rid of the __detach_device() function.
      
      Fixes: 92d420ec ("iommu/amd: Relax locking in dma_ops path")
      Reviewed-by: Filippo Sironi <sironi@amazon.de>
      Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      f6c0bfce
    • Joerg Roedel's avatar
      iommu/amd: Remove amd_iommu_devtable_lock · 3a11905b
      Joerg Roedel authored
      The lock is not necessary because the device table does not
      contain shared state that needs protection. Locking is only
      needed on an individual entry basis, and that needs to
      happen on the iommu_dev_data level.
      
      Fixes: 92d420ec ("iommu/amd: Relax locking in dma_ops path")
      Reviewed-by: Filippo Sironi <sironi@amazon.de>
      Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      3a11905b
    • Joerg Roedel's avatar
      iommu/amd: Remove domain->updated · f15d9a99
      Joerg Roedel authored
      This struct member was used to track whether a domain
      change requires updates to the device-table and IOMMU cache
      flushes. The problem is that access to this field is racy
      since locking in the common mapping code-paths has been
      eliminated.
      
      Move the updated field to the stack to get rid of all
      potential races and remove the field from the struct.
      
      Fixes: 92d420ec ("iommu/amd: Relax locking in dma_ops path")
      Reviewed-by: Filippo Sironi <sironi@amazon.de>
      Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      f15d9a99
    • Linus Torvalds's avatar
      Merge branch 'next-integrity' of... · f1f2f614
      Linus Torvalds authored
      Merge branch 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity
      
      Pull integrity updates from Mimi Zohar:
       "The major feature this time is IMA support for measuring and
        appraising appended file signatures. In addition, there are a couple
        of bug fixes and a code cleanup to use struct_size().
      
        In addition to the PE/COFF and IMA xattr signatures, the kexec kernel
        image may be signed with an appended signature, using the same
        scripts/sign-file tool that is used to sign kernel modules.
      
        Similarly, the initramfs may contain an appended signature.
      
        This contained a lot of refactoring of the existing appended signature
        verification code, so that IMA could retain the existing framework of
        calculating the file hash once, storing it in the IMA measurement list
        and extending the TPM, verifying the file's integrity based on a file
        hash or signature (eg. xattrs), and adding an audit record containing
        the file hash, all based on policy. (The IMA support for appended
        signatures patch set was posted and reviewed 11 times.)
      
        The support for appended signature paves the way for adding other
        signature verification methods, such as fs-verity, based on a single
        system-wide policy. The file hash used for verifying the signature and
        the signature, itself, can be included in the IMA measurement list"
      
      * 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
        ima: ima_api: Use struct_size() in kzalloc()
        ima: use struct_size() in kzalloc()
        sefltest/ima: support appended signatures (modsig)
        ima: Fix use after free in ima_read_modsig()
        MODSIGN: make new include file self contained
        ima: fix freeing ongoing ahash_request
        ima: always return negative code for error
        ima: Store the measurement again when appraising a modsig
        ima: Define ima-modsig template
        ima: Collect modsig
        ima: Implement support for module-style appended signatures
        ima: Factor xattr_verify() out of ima_appraise_measurement()
        ima: Add modsig appraise_type option for module-style appended signatures
        integrity: Select CONFIG_KEYS instead of depending on it
        PKCS#7: Introduce pkcs7_get_digest()
        PKCS#7: Refactor verify_pkcs7_signature()
        MODSIGN: Export module signature definitions
        ima: initialize the "template" field with the default template
      f1f2f614
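
The cleanup mentioned above switches kzalloc() size calculations to struct_size(), which computes the size of a structure with a trailing flexible array while guarding against arithmetic overflow. The following is a userspace sketch of the same idea, not the kernel macro itself. The structure `digest_entry` and both function names are hypothetical, and calloc() stands in for kzalloc().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record with a trailing flexible array member. */
struct digest_entry {
	size_t count;
	uint32_t values[];
};

/*
 * Number of bytes needed for a digest_entry holding n values,
 * or SIZE_MAX if the calculation would overflow size_t.
 */
size_t digest_entry_size(size_t n)
{
	if (n > (SIZE_MAX - sizeof(struct digest_entry)) / sizeof(uint32_t))
		return SIZE_MAX;
	return sizeof(struct digest_entry) + n * sizeof(uint32_t);
}

/* Allocate a zeroed entry for n values; returns NULL on overflow or OOM. */
struct digest_entry *digest_entry_alloc(size_t n)
{
	size_t bytes = digest_entry_size(n);
	struct digest_entry *e;

	if (bytes == SIZE_MAX)
		return NULL;
	assert(bytes >= sizeof(*e));

	e = calloc(1, bytes);
	if (e)
		e->count = n;
	return e;
}
```

Saturating to SIZE_MAX means an overflowing request fails cleanly instead of yielding an undersized buffer, which is the property the open-coded `sizeof(*p) + n * sizeof(elem)` pattern lacks.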
    • Linus Torvalds's avatar
      Merge tag 'nfsd-5.4' of git://linux-nfs.org/~bfields/linux · 298fb76a
      Linus Torvalds authored
      Pull nfsd updates from Bruce Fields:
       "Highlights:
      
         - Add a new knfsd file cache, so that we don't have to open and close
           on each (NFSv2/v3) READ or WRITE. This can speed up read and write
           in some cases. It also replaces our readahead cache.
      
         - Prevent silent data loss on write errors, by treating write errors
           like server reboots for the purposes of write caching, thus forcing
           clients to resend their writes.
      
         - Tweak the code that allocates sessions to be more forgiving, so
           that NFSv4.1 mounts are less likely to hang when a server already
           has a lot of clients.
      
         - Eliminate an arbitrary limit on NFSv4 ACL sizes; they should now be
           limited only by the backend filesystem and the maximum RPC size.
      
         - Allow the server to enforce use of the correct kerberos credentials
           when a client reclaims state after a reboot.
      
        And some miscellaneous smaller bugfixes and cleanup"
      
      * tag 'nfsd-5.4' of git://linux-nfs.org/~bfields/linux: (34 commits)
        sunrpc: clean up indentation issue
        nfsd: fix nfs read eof detection
        nfsd: Make nfsd_reset_boot_verifier_locked static
        nfsd: degraded slot-count more gracefully as allocation nears exhaustion.
        nfsd: handle drc over-allocation gracefully.
        nfsd: add support for upcall version 2
        nfsd: add a "GetVersion" upcall for nfsdcld
        nfsd: Reset the boot verifier on all write I/O errors
        nfsd: Don't garbage collect files that might contain write errors
        nfsd: Support the server resetting the boot verifier
        nfsd: nfsd_file cache entries should be per net namespace
        nfsd: eliminate an unnecessary acl size limit
        Deprecate nfsd fault injection
        nfsd: remove duplicated include from filecache.c
        nfsd: Fix the documentation for svcxdr_tmpalloc()
        nfsd: Fix up some unused variable warnings
        nfsd: close cached files prior to a REMOVE or RENAME that would replace target
        nfsd: rip out the raparms cache
        nfsd: have nfsd_test_lock use the nfsd_file cache
        nfsd: hook up nfs4_preprocess_stateid_op to the nfsd_file cache
        ...
      298fb76a
  4. 27 Sep, 2019 11 commits
    • Linus Torvalds's avatar
      Merge tag 'virtio-fs-5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse · 8f744bde
      Linus Torvalds authored
      Pull fuse virtio-fs support from Miklos Szeredi:
       "Virtio-fs allows exporting directory trees on the host and mounting
        them in guest(s).
      
        This isn't actually a new filesystem, but a glue layer between the
        fuse filesystem and a virtio based back-end.
      
        It's similar in functionality to the existing virtio-9p solution, but
        significantly faster in benchmarks and has better POSIX compliance.
        Further performance improvements can be achieved by sharing the page
        cache between host and guest, allowing for faster I/O and reduced
        memory use.
      
        Kata Containers have been including the out-of-tree virtio-fs (with
        the shared page cache patches as well) since version 1.7 as an
        experimental feature. They have been active in development and plan to
        switch from virtio-9p to virtio-fs as their default solution. There
        has been interest from other sources as well.
      
        The userspace infrastructure is slated to be merged into qemu once the
        kernel part hits mainline.
      
        This was developed by Vivek Goyal, Dave Gilbert and Stefan Hajnoczi"
      
      * tag 'virtio-fs-5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
        virtio-fs: add virtiofs filesystem
        virtio-fs: add Documentation/filesystems/virtiofs.rst
        fuse: reserve values for mapping protocol
      8f744bde
    • Linus Torvalds's avatar
      Merge tag '9p-for-5.4' of git://github.com/martinetd/linux · 9977b1a7
      Linus Torvalds authored
      Pull 9p updates from Dominique Martinet:
       "Some of the usual small fixes and cleanup.
      
        Small fixes all around:
         - avoid overlayfs copy-up for PRIVATE mmaps
         - KMSAN uninitialized warning for transport error
         - one syzbot memory leak fix in 9p cache
         - internal API cleanup for v9fs_fill_super"
      
      * tag '9p-for-5.4' of git://github.com/martinetd/linux:
        9p/vfs_super.c: Remove unused parameter data in v9fs_fill_super
        9p/cache.c: Fix memory leak in v9fs_cache_session_get_cookie
        9p: Transport error uninitialized
        9p: avoid attaching writeback_fid on mmap with type PRIVATE
      9977b1a7
    • Linus Torvalds's avatar
      Merge tag 'riscv/for-v5.4-rc1-b' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux · 568d850e
      Linus Torvalds authored
      Pull more RISC-V updates from Paul Walmsley:
       "Some additional RISC-V updates.
      
        This includes one significant fix:
      
         - Prevent interrupts from being unconditionally re-enabled during
           exception handling if they were disabled in the context in which
           the exception occurred
      
        Also a few other fixes:
      
         - Fix a build error when sparse memory support is manually enabled
      
         - Prevent CPUs beyond CONFIG_NR_CPUS from being enabled in early boot
      
        And a few minor improvements:
      
         - DT improvements: in the FU540 SoC DT files, improve U-Boot
           compatibility by adding an "ethernet0" alias, drop an unnecessary
           property from the DT files, and add support for the PWM device
      
         - KVM preparation: add a KVM-related macro for future RISC-V KVM
           support, and export some symbols required to build KVM support as
           modules
      
         - defconfig additions: build more drivers by default for QEMU
           configurations"
      
      * tag 'riscv/for-v5.4-rc1-b' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
        riscv: Avoid interrupts being erroneously enabled in handle_exception()
        riscv: dts: sifive: Drop "clock-frequency" property of cpu nodes
        riscv: dts: sifive: Add ethernet0 to the aliases node
        RISC-V: Export kernel symbols for kvm
        KVM: RISC-V: Add KVM_REG_RISCV for ONE_REG interface
        arch/riscv: disable excess harts before picking main boot hart
        RISC-V: Enable VIRTIO drivers in RV64 and RV32 defconfig
        RISC-V: Fix building error when CONFIG_SPARSEMEM_MANUAL=y
        riscv: dts: Add DT support for SiFive FU540 PWM driver
      568d850e
    • Linus Torvalds's avatar
      Merge tag 'nios2-v5.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2 · 70570a64
      Linus Torvalds authored
      Pull nios2 fix from Ley Foon Tan:
       "Make sure the command line buffer is NUL-terminated"
      
      * tag 'nios2-v5.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2:
        nios2: force the string buffer NULL-terminated
      70570a64
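
The nios2 fix above concerns a command line buffer that could be left without a terminating NUL. In standard C, strncpy() does not write a terminator when the source is at least as long as the destination. The sketch below shows one way to copy with guaranteed termination. It is a userspace illustration with a hypothetical helper name, not the code from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy src into dst (capacity dstlen bytes), truncating if needed and
 * always leaving dst NUL-terminated. Returns the number of bytes
 * copied, excluding the terminator. Hypothetical helper.
 */
size_t copy_terminated(char *dst, const char *src, size_t dstlen)
{
	size_t n;

	assert(dst != NULL && src != NULL && dstlen > 0);

	n = strlen(src);
	if (n >= dstlen)
		n = dstlen - 1;	/* reserve one byte for the terminator */
	memcpy(dst, src, n);
	dst[n] = '\0';
	return n;
}
```

With an 8-byte destination, copying "console=ttyS0" stores "console" plus the terminator, so later string functions never read past the buffer.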
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 8bbe0dec
      Linus Torvalds authored
      Pull more KVM updates from Paolo Bonzini:
       "x86 KVM changes:
      
         - The usual accuracy improvements for nested virtualization
      
         - The usual round of code cleanups from Sean
      
         - Added back optimizations that were prematurely removed in 5.2 (the
           bare minimum needed to fix the regression was in 5.3-rc8, here
           comes the rest)
      
         - Support for UMWAIT/UMONITOR/TPAUSE
      
         - Direct L2->L0 TLB flushing when L0 is Hyper-V and L1 is KVM
      
         - Tell Windows guests if SMT is disabled on the host
      
         - More accurate detection of vmexit cost
      
         - Revert a pvqspinlock pessimization"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (56 commits)
        KVM: nVMX: cleanup and fix host 64-bit mode checks
        KVM: vmx: fix build warnings in hv_enable_direct_tlbflush() on i386
        KVM: x86: Don't check kvm_rebooting in __kvm_handle_fault_on_reboot()
        KVM: x86: Drop ____kvm_handle_fault_on_reboot()
        KVM: VMX: Add error handling to VMREAD helper
        KVM: VMX: Optimize VMX instruction error and fault handling
        KVM: x86: Check kvm_rebooting in kvm_spurious_fault()
        KVM: selftests: fix ucall on x86
        Revert "locking/pvqspinlock: Don't wait if vCPU is preempted"
        kvm: nvmx: limit atomic switch MSRs
        kvm: svm: Intercept RDPRU
        kvm: x86: Add "significant index" flag to a few CPUID leaves
        KVM: x86/mmu: Skip invalid pages during zapping iff root_count is zero
        KVM: x86/mmu: Explicitly track only a single invalid mmu generation
        KVM: x86/mmu: Revert "KVM: x86/mmu: Remove is_obsolete() call"
        KVM: x86/mmu: Revert "Revert "KVM: MMU: reclaim the zapped-obsolete page first""
        KVM: x86/mmu: Revert "Revert "KVM: MMU: collapse TLB flushes when zap all pages""
        KVM: x86/mmu: Revert "Revert "KVM: MMU: zap pages in batch""
        KVM: x86/mmu: Revert "Revert "KVM: MMU: add tracepoint for kvm_mmu_invalidate_all_pages""
        KVM: x86/mmu: Revert "Revert "KVM: MMU: show mmu_valid_gen in shadow page related tracepoints""
        ...
      8bbe0dec
    • Linus Torvalds's avatar
      Merge tag 'pwm/for-5.4-rc1' of... · e37e3bc7
      Linus Torvalds authored
      Merge tag 'pwm/for-5.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm
      
      Pull pwm updates from Thierry Reding:
       "Besides one new driver being added for the PWM controller found in
        various Spreadtrum SoCs, this series of changes brings a slew of,
        mostly minor, fixes and cleanups for existing drivers, as well as some
        enhancements to the core code.
      
        Lastly, Uwe is added to the PWM subsystem entry of the MAINTAINERS
        file, making official his role as a reviewer"
      
      * tag 'pwm/for-5.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm: (34 commits)
        MAINTAINERS: Add myself as reviewer for the PWM subsystem
        MAINTAINERS: Add patchwork link for PWM entry
        MAINTAINERS: Add a selection of PWM related keywords to the PWM entry
        pwm: mediatek: Add MT7629 compatible string
        dt-bindings: pwm: Update bindings for MT7629 SoC
        pwm: mediatek: Update license and switch to SPDX tag
        pwm: mediatek: Use pwm_mediatek as common prefix
        pwm: mediatek: Allocate the clks array dynamically
        pwm: mediatek: Remove the has_clks field
        pwm: mediatek: Drop the check for of_device_get_match_data()
        pwm: atmel: Consolidate driver data initialization
        pwm: atmel: Remove unneeded check for match data
        pwm: atmel: Remove platform_device_id and use only dt bindings
        pwm: stm32-lp: Add check in case requested period cannot be achieved
        pwm: Ensure pwm_apply_state() doesn't modify the state argument
        pwm: fsl-ftm: Don't update the state for the caller of pwm_apply_state()
        pwm: sun4i: Don't update the state for the caller of pwm_apply_state()
        pwm: rockchip: Don't update the state for the caller of pwm_apply_state()
        pwm: Let pwm_get_state() return the last implemented state
        pwm: Introduce local struct pwm_chip in pwm_apply_state()
        ...
      e37e3bc7
    • Linus Torvalds's avatar
      Merge tag 'for-5.4/io_uring-2019-09-27' of git://git.kernel.dk/linux-block · 738f531d
      Linus Torvalds authored
      Pull more io_uring updates from Jens Axboe:
       "Just two things in here:
      
         - Improvement to the io_uring CQ ring wakeup for batched IO (me)
      
         - Fix wrong comparison in poll handling (yangerkun)
      
        I realize the first one is a little late in the game, but it felt
        pointless to hold it off until the next release. Went through various
        testing and reviews with Pavel and peterz"
      
      * tag 'for-5.4/io_uring-2019-09-27' of git://git.kernel.dk/linux-block:
        io_uring: make CQ ring wakeups be more efficient
        io_uring: compare cached_cq_tail with cq.head in_io_uring_poll
      738f531d
    • Colin Ian King's avatar
      net: tap: clean up an indentation issue · faeacb6d
      Colin Ian King authored
      A statement is indented too deeply; remove the extraneous tab.
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      faeacb6d
    • Linus Torvalds's avatar
      Merge tag 'for-linus-2019-09-27' of git://git.kernel.dk/linux-block · 47db9b9a
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
       "A few fixes/changes to round off this merge window. This contains:
      
         - Small series making some functional tweaks to blk-iocost (Tejun)
      
         - Elevator switch locking fix (Ming)
      
         - Kill redundant call in blk-wbt (Yufen)
      
         - Fix flush timeout handling (Yufen)"
      
      * tag 'for-linus-2019-09-27' of git://git.kernel.dk/linux-block:
        block: fix null pointer dereference in blk_mq_rq_timed_out()
        rq-qos: get rid of redundant wbt_update_limits()
        iocost: bump up default latency targets for hard disks
        iocost: improve nr_lagging handling
        iocost: better trace vrate changes
        block: don't release queue's sysfs lock during switching elevator
        blk-mq: move lockdep_assert_held() into elevator_exit
      47db9b9a
    • Navid Emamdoost's avatar
      nfp: abm: fix memory leak in nfp_abm_u32_knode_replace · 78beef62
      Navid Emamdoost authored
      In nfp_abm_u32_knode_replace(), if the allocation for match fails, it
      should go to the error handling path instead of returning directly.
      Also update the other gotos so that they return the correct errno.
      Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      78beef62
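
The leak described above follows a common pattern: a later allocation fails and the function returns without releasing what it already holds. A minimal userspace sketch of the goto-based cleanup style is shown here. The `knode` structure, the function names and the `fail_match` switch are all hypothetical and only stand in for the driver's real objects.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical object standing in for the driver's allocations. */
struct knode {
	int *match;
};

/*
 * Allocate a knode and its match table. fail_match simulates the
 * second allocation failing so the cleanup path can be exercised.
 * Returns 0 on success or a negative errno value.
 */
int knode_create(struct knode **out, int fail_match)
{
	struct knode *k;
	int err;

	assert(out != NULL);

	k = calloc(1, sizeof(*k));
	if (!k)
		return -ENOMEM;	/* nothing allocated yet, plain return is fine */

	k->match = fail_match ? NULL : calloc(4, sizeof(*k->match));
	if (!k->match) {
		err = -ENOMEM;
		goto err_free_knode;	/* a bare return here would leak k */
	}

	*out = k;
	return 0;

err_free_knode:
	free(k);
	*out = NULL;
	return err;
}

/* Release a knode created by knode_create(); NULL is accepted. */
void knode_destroy(struct knode *k)
{
	if (!k)
		return;
	free(k->match);
	free(k);
}
```

Setting `err` right before each jump keeps the returned errno tied to the failure that actually occurred, which matches the second half of the commit message.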
    • Eric Dumazet's avatar
      tcp: better handle TCP_USER_TIMEOUT in SYN_SENT state · a41e8a88
      Eric Dumazet authored
      Yuchung Cheng and Marek Majkowski independently reported weird
      behavior of the TCP_USER_TIMEOUT option when used at connect() time.
      
      When the TCP_USER_TIMEOUT is reached, tcp_write_timeout()
      believes the flow should live, and the following condition
      in tcp_clamp_rto_to_user_timeout() programs one-jiffy timers:
      
          remaining = icsk->icsk_user_timeout - elapsed;
          if (remaining <= 0)
              return 1; /* user timeout has passed; fire ASAP */
      
      This silly situation ends when the max syn rtx count is reached.
      
      This patch makes sure we honor both TCP_SYNCNT and TCP_USER_TIMEOUT,
      avoiding these spurious SYN packets.
      
      Fixes: b701a99e ("tcp: Add tcp_clamp_rto_to_user_timeout() helper to improve accuracy")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Yuchung Cheng <ycheng@google.com>
      Reported-by: Marek Majkowski <marek@cloudflare.com>
      Cc: Jon Maxwell <jmaxwell37@gmail.com>
      Link: https://marc.info/?l=linux-netdev&m=156940118307949&w=2
      Acked-by: Jon Maxwell <jmaxwell37@gmail.com>
      Tested-by: Marek Majkowski <marek@cloudflare.com>
      Signed-off-by: Marek Majkowski <marek@cloudflare.com>
      Acked-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a41e8a88
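
To see why the condition quoted in the commit above leads to back-to-back timers, here is a simplified userspace model of the clamp. It is a sketch under stated assumptions, not the kernel function: all values share one time unit, the jiffies conversion is omitted, and the handling of an unset user timeout and the final minimum are assumptions added so the function is complete.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Simplified model of the clamp quoted in the commit message.
 *   rto          - currently computed retransmission timeout (> 0)
 *   user_timeout - TCP_USER_TIMEOUT value, 0 meaning "not set"
 *   elapsed      - time since the first unacknowledged transmission
 * Returns the timeout to program for the next retransmission timer.
 */
uint32_t clamp_rto_to_user_timeout(uint32_t rto, uint32_t user_timeout,
				   uint32_t elapsed)
{
	int64_t remaining;

	assert(rto > 0);

	if (user_timeout == 0)
		return rto;	/* assumption: no clamp without a user timeout */

	remaining = (int64_t)user_timeout - (int64_t)elapsed;
	if (remaining <= 0)
		return 1;	/* user timeout has passed; fire ASAP */

	/* assumption: never wait longer than the time left */
	return remaining < (int64_t)rto ? (uint32_t)remaining : rto;
}
```

Once `elapsed` reaches `user_timeout`, every call returns 1. If the caller still believes the flow should live, as the commit message describes for the SYN_SENT case, the timer keeps being rearmed for the shortest possible interval until the SYN retransmission limit is reached.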