1. 23 Mar, 2018 6 commits
    • KVM: PPC: Book3S HV: Work around transactional memory bugs in POWER9 · 4bb3c7a0
      Paul Mackerras authored
      POWER9 has hardware bugs relating to transactional memory and thread
      reconfiguration (changes to hardware SMT mode).  Specifically, the core
      does not have enough storage to store a complete checkpoint of all the
      architected state for all four threads.  The DD2.2 version of POWER9
      includes hardware modifications designed to allow hypervisor software
      to implement workarounds for these problems.  This patch implements
      those workarounds in KVM code so that KVM guests see a full, working
      transactional memory implementation.
      
      The problems center around the use of TM suspended state, where the
      CPU has a checkpointed state but execution is not transactional.  The
      workaround is to implement a "fake suspend" state, which looks to the
      guest like suspended state but the CPU does not store a checkpoint.
      In this state, any instruction that would cause a transition to
      transactional state (rfid, rfebb, mtmsrd, tresume) or would use the
      checkpointed state (treclaim) causes a "soft patch" interrupt (vector
      0x1500) to the hypervisor so that it can be emulated.  The trechkpt
      instruction also causes a soft patch interrupt.
      
      On POWER9 DD2.2, we avoid returning to the guest in any state which
      would require a checkpoint to be present.  The trechkpt in the guest
      entry path which would normally create that checkpoint is replaced by
      either a transition to fake suspend state, if the guest is in suspend
      state, or a rollback to the pre-transactional state if the guest is in
      transactional state.  Fake suspend state is indicated by a flag in the
      PACA plus a new bit in the PSSCR.  The new PSSCR bit is write-only and
      reads back as 0.
      
      On exit from the guest, if the guest is in fake suspend state, we still
      do the treclaim instruction as we would in real suspend state, in order
      to get into non-transactional state, but we do not save the resulting
      register state since there was no checkpoint.
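
      As a rough illustration of the entry and exit rules above, the
      following Python sketch models only the decision being made.  The
      names TMState, EntryAction, guest_entry_action() and
      save_checkpoint_on_exit() are hypothetical and are not taken from
      the KVM code; the real workaround is implemented in the guest entry
      and exit paths.

```python
from enum import Enum


class TMState(Enum):
    """Guest transactional-memory state at guest entry (illustrative only)."""
    NON_TRANSACTIONAL = "non-transactional"
    SUSPENDED = "suspended"
    TRANSACTIONAL = "transactional"


class EntryAction(Enum):
    """What takes the place of trechkpt on POWER9 DD2.2 in this model."""
    NOTHING = "no checkpoint needed"
    FAKE_SUSPEND = "enter fake suspend state"
    ROLLBACK = "roll back to pre-transactional state"


def guest_entry_action(state: TMState) -> EntryAction:
    """Pick the action that avoids needing a hardware checkpoint."""
    if state is TMState.SUSPENDED:
        return EntryAction.FAKE_SUSPEND
    if state is TMState.TRANSACTIONAL:
        return EntryAction.ROLLBACK
    return EntryAction.NOTHING


def save_checkpoint_on_exit(fake_suspend: bool) -> bool:
    """treclaim is still executed on exit; the resulting register state is
    saved only when a real checkpoint existed (i.e. not in fake suspend)."""
    return not fake_suspend
```

      In this model a suspended guest is never entered with a real
      checkpoint, which is why the exit path has nothing to save when
      the fake suspend flag is set.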
      
      Emulation of the instructions that cause a softpatch interrupt is
      handled in two paths.  If the guest is in real suspend mode, we call
      kvmhv_p9_tm_emulation_early() to handle the cases where the guest is
      transitioning to transactional state.  This is called before we do the
      treclaim in the guest exit path; because we haven't done treclaim, we
      can get back to the guest with the transaction still active.  If the
      instruction is a case that kvmhv_p9_tm_emulation_early() doesn't
      handle, or if the guest is in fake suspend state, then we proceed to
      do the complete guest exit path and subsequently call
      kvmhv_p9_tm_emulation() in host context with the MMU on.  This handles
      all the cases including the cases that generate program interrupts
      (illegal instruction or TM Bad Thing) and facility unavailable
      interrupts.
      
      The emulation is reasonably straightforward and is mostly concerned
      with checking for exception conditions and updating the state of
      registers such as MSR and CR0.  The treclaim emulation takes care to
      ensure that the TEXASR register gets updated as if it were the guest
      treclaim instruction that had done failure recording, not the treclaim
      done in hypervisor state in the guest exit path.
      
      With this, the KVM_CAP_PPC_HTM capability returns true (1) even if
      transactional memory is not available to host userspace.
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/powernv: Provide a way to force a core into SMT4 mode · 7672691a
      Paul Mackerras authored
      POWER9 processors up to and including "Nimbus" v2.2 have hardware
      bugs relating to transactional memory and thread reconfiguration.
      One of these bugs has a workaround which is to get the core into
      SMT4 state temporarily.  This workaround is only needed when
      running bare-metal.
      
      This patch provides a function which gets the core into SMT4 mode
      by preventing threads from going to a stop state, and waking up
      those which are already in a stop state.  Once at least 3 threads
      are not in a stop state, the core will be in SMT4 and we can
      continue.
      
      To do this, we add a "dont_stop" flag to the paca to tell the
      thread not to go into a stop state.  If this flag is set,
      power9_idle_stop() just returns immediately with a return value
      of 0.  The pnv_power9_force_smt4_catch() function does the following:
      
      1. Set the dont_stop flag for each thread in the core, except
         ourselves (in fact we use an atomic_inc() in case more than
         one thread is calling this function concurrently).
      2. See how many threads are awake, indicated by their
         requested_psscr field in the paca being 0.  If this is at
         least 3, skip to step 5.
      3. Send a doorbell interrupt to each thread that was seen as
         being in a stop state in step 2.
      4. Until at least 3 threads are awake, scan the threads to which
         we sent a doorbell interrupt and check if they are awake now.
      
      This relies on the following properties:
      
      - Once dont_stop is non-zero, requested_psscr can't go from zero to
        non-zero, except transiently (and without the thread doing stop).
      - requested_psscr being zero guarantees that the thread isn't in
        a state-losing stop state where thread reconfiguration could occur.
      - Doing stop with a PSSCR value of 0 won't be a state-losing stop
        and thus won't allow thread reconfiguration.
      - Once threads_per_core/2 + 1 (i.e. 3) threads are awake, the core
        must be in SMT4 mode, since SMT modes are powers of 2.
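
      The steps and properties above can be sketched as a small Python
      model.  This is an illustration under stated assumptions, not the
      kernel implementation: ThreadModel, force_smt4_catch() and
      force_smt4_release() are hypothetical names, a plain integer
      stands in for the atomic counter, and a doorbell is assumed to
      wake its target immediately.

```python
THREADS_PER_CORE = 4
# threads_per_core/2 + 1 awake threads cannot fit in SMT1 or SMT2,
# so the core must be in SMT4 once this many are awake.
AWAKE_NEEDED = THREADS_PER_CORE // 2 + 1


class ThreadModel:
    """Per-thread state; field names mirror the paca fields in the text."""

    def __init__(self, requested_psscr=0):
        self.dont_stop = 0                      # counter, not a boolean
        self.requested_psscr = requested_psscr  # 0 means "awake"

    def is_awake(self):
        return self.requested_psscr == 0

    def doorbell(self):
        # Modelling assumption: the woken thread clears the field at once.
        self.requested_psscr = 0


def force_smt4_catch(core, me):
    """Return the list of thread indices that had to be woken."""
    # Step 1: forbid every other thread from entering a stop state.
    for i, thread in enumerate(core):
        if i != me:
            thread.dont_stop += 1
    # Step 2: count awake threads; finish early if there are enough.
    stopped = [i for i, thread in enumerate(core) if not thread.is_awake()]
    if len(core) - len(stopped) >= AWAKE_NEEDED:
        return []
    # Step 3: send a doorbell to each thread seen in a stop state.
    for i in stopped:
        core[i].doorbell()
    # Step 4: the real code polls here; the model wakes synchronously.
    assert sum(thread.is_awake() for thread in core) >= AWAKE_NEEDED
    return stopped


def force_smt4_release(core, me):
    """Undo step 1 so the other threads may stop again."""
    for i, thread in enumerate(core):
        if i != me:
            thread.dont_stop -= 1
```

      Using a counter rather than a flag in the model reflects the
      remark in step 1 that more than one thread may call the function
      concurrently.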
      
      This does add a sync to power9_idle_stop(), which is necessary to
      provide the correct ordering between setting requested_psscr and
      checking dont_stop.  The overhead of the sync should be unnoticeable
      compared to the latency of going into and out of a stop state.
      
      Because some objected to incurring this extra latency on systems where
      the XER[SO] bug is not relevant, I have put the test in
      power9_idle_stop inside a feature section.  This means that
      pnv_power9_force_smt4_catch() WILL NOT WORK correctly on systems
      without the CPU_FTR_P9_TM_XER_SO_BUG feature bit set, and will
      probably hang the system.
      
      In order to cater for uses where the caller has an operation that
      has to be done while the core is in SMT4, the core continues to be
      kept in SMT4 after the pnv_power9_force_smt4_catch() function returns,
      until the pnv_power9_force_smt4_release() function is called.
      It undoes the effect of step 1 above and allows the other threads
      to go into a stop state.
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc: Add CPU feature bits for TM bug workarounds on POWER9 v2.2 · b5af4f27
      Paul Mackerras authored
      This adds a CPU feature bit which is set for POWER9 "Nimbus" DD2.2
      processors which will be used to enable the hypervisor to assist
      hardware with the handling of checkpointed register values while the
      CPU is in suspend state, in order to work around hardware bugs.  The
      hardware assistance for these workarounds introduced a new hardware
      bug relating to the XER[SO] bit.  We add a separate feature bit for
      this bug in case future chips fix it while still requiring the
      hypervisor assistance with suspend state.
      
      When the dt_cpu_ftrs subsystem is in use, the software assistance can
      be enabled using a "tm-suspend-hypervisor-assist" node in the device
      tree, and a "tm-suspend-xer-so-bug" node enables the workarounds for
      the XER[SO] bug.  In the absence of such nodes, a quirk enables both
      for POWER9 "Nimbus" DD2.2 processors.
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc: Free up CPU feature bits on 64-bit machines · 9bbf0b57
      Paul Mackerras authored
      This moves all the CPU feature bits that are only used on 32-bit
      machines to the top 20 bits of the CPU feature word and arranges
      for them to be defined only in 32-bit builds.  The features that
      are common to 32-bit and 64-bit machines are moved to bits 0-11
      of the CPU feature word.  This means that for 64-bit platforms,
      bits 44-63 can now be used for new features that only exist on
      64-bit machines.  (These bit numbers are counting from the right,
      i.e. the LSB is bit 0.)
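
      The bit ranges described above can be written out as masks.  The
      sketch below is only arithmetic on the numbers given in the text;
      bit_range_mask() and the constant names are hypothetical, and
      reading "the top 20 bits" as bits 12-31 of a 32-bit feature word
      is an assumption.

```python
def bit_range_mask(lo: int, hi: int) -> int:
    """Mask with bits lo..hi set, counting from the LSB as bit 0."""
    return ((1 << (hi - lo + 1)) - 1) << lo


# Features common to 32-bit and 64-bit machines: bits 0-11.
COMMON_MASK = bit_range_mask(0, 11)

# Features used only on 32-bit machines: the top 20 bits of the word
# (assumed here to mean bits 12-31 of a 32-bit word).
ONLY_32BIT_MASK = bit_range_mask(12, 31)

# Bits now available for new 64-bit-only features: bits 44-63.
NEW_64BIT_MASK = bit_range_mask(44, 63)

if __name__ == "__main__":
    print(f"common      {COMMON_MASK:#018x}")
    print(f"32-bit only {ONLY_32BIT_MASK:#018x}")
    print(f"new 64-bit  {NEW_64BIT_MASK:#018x}")
```

      Counting the set bits in NEW_64BIT_MASK gives the 20 free bits
      mentioned at the end of this commit message.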
      
      Because CPU_FTR_L3_DISABLE_NAP moved from the low 16 bits to the high
      16 bits, we have to adjust some assembly code.  Also, CPU_FTR_EMB_HV
      moved from the high 16 bits to the low 16 bits.
      
      Note that CPU_FTR_REAL_LE only applies to 64-bit chips, because only
      64-bit chips (POWER6, 7, 8, 9) have a true little-endian mode that is
      a CPU execution mode as opposed to being a page attribute.
      
      With this we now have 20 free CPU feature bits on 64-bit machines.
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc: Book E: Remove unused CPU_FTR_L2CSR bit · dd0efb3f
      Paul Mackerras authored
      The CPU_FTR_L2CSR bit is never tested anywhere, so let's reclaim the
      bit.
      
      The last usage was removed in 86d63363 ("powerpc/e500mc: Remove
      dead L2 flushing code in idle_e500.S") (Jun 2015).
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc: Use feature bit for RTC presence rather than timebase presence · c0d64cf9
      Paul Mackerras authored
      All PowerPC CPUs other than the original PPC601 have a timebase
      register rather than the "real-time clock" (RTC) register that the
      PPC601 (and the original POWER and POWER2 CPUs) had.  Currently
      we have a CPU feature bit to indicate the presence of the timebase,
      but it makes more sense to use a bit to indicate the unusual
      situation rather than the common situation.  This therefore defines
      a CPU_FTR_USE_RTC bit in place of the CPU_FTR_USE_TB bit, and
      arranges for it to be set on PPC601 systems.
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  2. 11 Feb, 2018 9 commits
    • Linux 4.16-rc1 · 7928b2cb
      Linus Torvalds authored
    • unify {de,}mangle_poll(), get rid of kernel-side POLL... · 7a163b21
      Al Viro authored
      except, again, POLLFREE and POLL_BUSY_LOOP.
      
      With this, we finally get to the promised end result:
      
       - POLL{IN,OUT,...} are plain integers and *not* in __poll_t, so any
         stray instances of ->poll() still using those will be caught by
         sparse.
      
       - eventpoll.c and select.c warning-free wrt __poll_t
      
       - no more kernel-side definitions of POLL... - userland ones are
         visible through the entire kernel (and used pretty much only for
         mangle/demangle)
      
       - same behavior as after the first series (i.e. sparc et.al. epoll(2)
         working correctly).
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • vfs: do bulk POLL* -> EPOLL* replacement · a9a08845
      Linus Torvalds authored
      This is the mindless scripted replacement of kernel use of POLL*
      variables as described by Al, done by this script:
      
          for V in IN OUT PRI ERR RDNORM RDBAND WRNORM WRBAND HUP RDHUP NVAL MSG; do
              L=`git grep -l -w POLL$V | grep -v '^t' | grep -v /um/ | grep -v '^sa' | grep -v '/poll.h$'|grep -v '^D'`
              for f in $L; do sed -i "-es/^\([^\"]*\)\(\<POLL$V\>\)/\\1E\\2/" $f; done
          done
      
      with de-mangling cleanups yet to come.
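
      To make the effect of the sed expression easier to follow, here is
      an approximate Python rendering of the per-line rewrite.  It is a
      sketch, not the script that was run: rewrite_line() is a
      hypothetical name, Python's \b stands in for sed's \< and \> word
      anchors, and the file selection done by "git grep" and the
      "grep -v" filters is not modelled.

```python
import re

# The same suffix list the shell loop iterates over.
POLL_SUFFIXES = ["IN", "OUT", "PRI", "ERR", "RDNORM", "RDBAND",
                 "WRNORM", "WRBAND", "HUP", "RDHUP", "NVAL", "MSG"]


def rewrite_line(line: str) -> str:
    """Prefix POLL<suffix> with 'E' when no double quote precedes it.

    As in the sed command (no 'g' flag), at most one substitution is
    made per suffix per line.
    """
    for suffix in POLL_SUFFIXES:
        pattern = r'^([^"]*)(\bPOLL' + suffix + r'\b)'
        line = re.sub(pattern, r"\1E\2", line, count=1)
    return line


if __name__ == "__main__":
    print(rewrite_line("mask |= POLLIN | POLLRDNORM;"))
    print(rewrite_line('pr_debug("POLLIN set");'))
```

      The [^"]* prefix is what keeps the rewrite away from text that
      follows a double quote on the same line, and the word anchors
      leave names such as POLLFREE and already-converted EPOLLIN alone.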
      
      NOTE! On almost all architectures, the EPOLL* constants have the same
      values as the POLL* constants do.  But the keyword here is "almost".
      For various bad reasons they aren't the same, and epoll() doesn't
      actually work quite correctly in some cases due to this on Sparc et al.
      
      The next patch from Al will sort out the final differences, and we
      should be all done.
      Scripted-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge branch 'work.poll2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · ee5daa13
      Linus Torvalds authored
      Pull more poll annotation updates from Al Viro:
       "This is preparation to solving the problems you've mentioned in the
        original poll series.
      
        After this series, the kernel is ready for running
      
            for V in IN OUT PRI ERR RDNORM RDBAND WRNORM WRBAND HUP RDHUP NVAL MSG; do
                  L=`git grep -l -w POLL$V | grep -v '^t' | grep -v /um/ | grep -v '^sa' | grep -v '/poll.h$'|grep -v '^D'`
                  for f in $L; do sed -i "-es/^\([^\"]*\)\(\<POLL$V\>\)/\\1E\\2/" $f; done
            done
      
        as a bulk search-and-replace.
      
        After that, the kernel is ready to apply the patch to unify
        {de,}mangle_poll(), and then get rid of kernel-side POLL... uses
        entirely, and we should be all done with that stuff.
      
        Basically, that's what you suggested wrt KPOLL..., except that we can
        use EPOLL... instead - they already are arch-independent (and equal to
        what is currently kernel-side POLL...).
      
        After the preparations (in this series) switch to returning EPOLL...
        from ->poll() instances is completely mechanical and kernel-side
        POLL... can go away. The last step (killing kernel-side POLL... and
        unifying {de,}mangle_poll()) has to be done after the
        search-and-replace job, since we need userland-side POLL... for
        unified {de,}mangle_poll(), thus the cherry-pick at the last step.
      
        After that we will have:
      
         - POLL{IN,OUT,...} *not* in __poll_t, so any stray instances of
           ->poll() still using those will be caught by sparse.
      
         - eventpoll.c and select.c warning-free wrt __poll_t
      
         - no more kernel-side definitions of POLL... - userland ones are
           visible through the entire kernel (and used pretty much only for
           mangle/demangle)
      
         - same behavior as after the first series (i.e. sparc et.al. epoll(2)
           working correctly)"
      
      * 'work.poll2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        annotate ep_scan_ready_list()
        ep_send_events_proc(): return result via esed->res
        preparation to switching ->poll() to returning EPOLL...
        add EPOLLNVAL, annotate EPOLL... and event_poll->event
        use linux/poll.h instead of asm/poll.h
        xen: fix poll misannotation
        smc: missing poll annotations
    • Merge tag 'xtensa-20180211' of git://github.com/jcmvbkbc/linux-xtensa · 3fc928dc
      Linus Torvalds authored
      Pull xtensa fix from Max Filippov:
       "Build fix for xtensa architecture with KASAN enabled"
      
      * tag 'xtensa-20180211' of git://github.com/jcmvbkbc/linux-xtensa:
        xtensa: fix build with KASAN
    • Merge tag 'nios2-v4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2 · 60d7a21a
      Linus Torvalds authored
      Pull nios2 update from Ley Foon Tan:
      
       - clean up old Kconfig options from defconfig
      
       - remove leading 0x and 0s from bindings notation in dts files
      
      * tag 'nios2-v4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2:
        nios2: defconfig: Cleanup from old Kconfig options
        nios2: dts: Remove leading 0x and 0s from bindings notation
    • xtensa: fix build with KASAN · f8d0cbf2
      Max Filippov authored
      The commit 917538e2 ("kasan: clean up KASAN_SHADOW_SCALE_SHIFT
      usage") removed KASAN_SHADOW_SCALE_SHIFT definition from
      include/linux/kasan.h and added it to architecture-specific headers,
      except for xtensa. This broke the xtensa build with KASAN enabled.
      Define KASAN_SHADOW_SCALE_SHIFT in arch/xtensa/include/asm/kasan.h.
      
      Reported-by: kbuild test robot <fengguang.wu@intel.com>
      Fixes: 917538e2 ("kasan: clean up KASAN_SHADOW_SCALE_SHIFT usage")
      Acked-by: Andrey Konovalov <andreyknvl@google.com>
      Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
    • nios2: defconfig: Cleanup from old Kconfig options · e0691ebb
      Krzysztof Kozlowski authored
      Remove old, dead Kconfig option INET_LRO. It is gone since
      commit 7bbf3cae ("ipv4: Remove inet_lro library").
      Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
      Acked-by: Ley Foon Tan <ley.foon.tan@intel.com>
    • nios2: dts: Remove leading 0x and 0s from bindings notation · 5d13c731
      Mathieu Malaterre authored
      Improve the DTS files by removing all the leading "0x" and zeros to fix the
      following dtc warnings:
      
      Warning (unit_address_format): Node /XXX unit name should not have leading "0x"
      
      and
      
      Warning (unit_address_format): Node /XXX unit name should not have leading 0s
      
      Converted using the following command:
      
      find . -type f \( -iname *.dts -o -iname *.dtsi \) -exec sed -E -i -e "s/@0x([0-9a-fA-F\.]+)\s?\{/@\L\1 \{/g" -e "s/@0+([0-9a-fA-F\.]+)\s?\{/@\L\1 \{/g" {} +
      
      For simplicity, two sed expressions were used to solve each warning separately.
      
      To make the regex expression more robust a few other issues were resolved,
      namely setting unit-address to lower case, and adding a whitespace before the
      opening curly brace:
      
      https://elinux.org/Device_Tree_Linux#Linux_conventions
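
      For readers less familiar with sed's \L case conversion, the two
      expressions can be approximated in Python as below.  This is a
      sketch for illustration only: fix_unit_addresses() is a
      hypothetical name, the node names in the examples are made up,
      and the find(1) traversal of .dts/.dtsi files is left out.

```python
import re


def fix_unit_addresses(text: str) -> str:
    """Strip a leading "0x" or leading zeros from DTS unit addresses,
    lower-case the address, and put one space before the opening brace."""

    def lower_addr(match: re.Match) -> str:
        # Equivalent of sed's "@\L\1 \{": lower-case the captured address.
        return "@" + match.group(1).lower() + " {"

    # First expression: unit names with a leading "0x".
    text = re.sub(r"@0x([0-9a-fA-F.]+)\s?\{", lower_addr, text)
    # Second expression: unit names with leading zeros.
    text = re.sub(r"@0+([0-9a-fA-F.]+)\s?\{", lower_addr, text)
    return text


if __name__ == "__main__":
    print(fix_unit_addresses("serial@0x18001600 {"))
    print(fix_unit_addresses("sdram@0x0C000000{"))
    print(fix_unit_addresses("cpu@0 {"))
```

      Because the second pattern requires at least one hex digit after
      the run of zeros, a unit address of plain "0" is left unchanged.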
      
      This is a follow up to commit 4c9847b7 ("dt-bindings: Remove leading 0x from bindings notation")
      Reported-by: David Daney <ddaney@caviumnetworks.com>
      Suggested-by: Rob Herring <robh@kernel.org>
      Signed-off-by: Mathieu Malaterre <malat@debian.org>
      Acked-by: Ley Foon Tan <ley.foon.tan@intel.com>
  3. 10 Feb, 2018 14 commits
    • Merge tag 'pci-v4.16-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci · d48fcbd8
      Linus Torvalds authored
      Pull PCI fix from Bjorn Helgaas:
       "Fix a POWER9/powernv INTx regression from the merge window (Alexey
        Kardashevskiy)"
      
      * tag 'pci-v4.16-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
        powerpc/pci: Fix broken INTx configuration via OF
    • Merge tag 'for-linus-20180210' of git://git.kernel.dk/linux-block · 9454473c
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
       "A few fixes to round off the merge window on the block side:
      
         - a set of bcache fixes by way of Michael Lyle, from the usual bcache
           suspects.
      
         - add a simple-to-hook-into function for bpf EIO error injection.
      
         - fix blk-wbt that mischaracterized flushes as reads. Improve the logic
           so that flushes and writes are accounted as writes, and only reads
           as reads. From me.
      
         - fix requeue crash in BFQ, from Paolo"
      
      * tag 'for-linus-20180210' of git://git.kernel.dk/linux-block:
        block, bfq: add requeue-request hook
        bcache: fix for data collapse after re-attaching an attached device
        bcache: return attach error when no cache set exist
        bcache: set writeback_rate_update_seconds in range [1, 60] seconds
        bcache: fix for allocator and register thread race
        bcache: set error_limit correctly
        bcache: properly set task state in bch_writeback_thread()
        bcache: fix high CPU occupancy during journal
        bcache: add journal statistic
        block: Add should_fail_bio() for bpf error injection
        blk-wbt: account flush requests correctly
    • Merge tag 'platform-drivers-x86-v4.16-3' of git://github.com/dvhart/linux-pdx86 · cc5cb5af
      Linus Torvalds authored
      Pull x86 platform driver updates from Darren Hart:
       "Mellanox fixes and new system type support.
      
        Mostly data for new system types with a correction and an
        uninitialized variable fix"
      
      [ Pulling from github because git.infradead.org currently seems to be
        down for some reason, but Darren had a backup location    - Linus ]
      
      * tag 'platform-drivers-x86-v4.16-3' of git://github.com/dvhart/linux-pdx86:
        platform/x86: mlx-platform: Add support for new 200G IB and Ethernet systems
        platform/x86: mlx-platform: Add support for new msn201x system type
        platform/x86: mlx-platform: Add support for new msn274x system type
        platform/x86: mlx-platform: Fix power cable setting for msn21xx family
        platform/x86: mlx-platform: Add define for the negative bus
        platform/x86: mlx-platform: Use defines for bus assignment
        platform/mellanox: mlxreg-hotplug: Fix uninitialized variable
    • Merge tag 'chrome-platform-for-linus-4.16' of... · e9d46f74
      Linus Torvalds authored
      Merge tag 'chrome-platform-for-linus-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/bleung/chrome-platform
      
      Pull chrome platform updates from Benson Leung:
      
       - move cros_ec_dev to drivers/mfd
      
       - other small maintenance fixes
      
      [ The cros_ec_dev movement came in earlier through the MFD tree  - Linus ]
      
      * tag 'chrome-platform-for-linus-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/bleung/chrome-platform:
        platform/chrome: Use proper protocol transfer function
        platform/chrome: cros_ec_lpc: Add support for Google Glimmer
        platform/chrome: cros_ec_lpc: Register the driver if ACPI entry is missing.
        platform/chrome: cros_ec_lpc: remove redundant pointer request
        cros_ec: fix nul-termination for firmware build info
        platform/chrome: chromeos_laptop: make chromeos_laptop const
    • Merge tag 'kvm-4.16-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 15303ba5
      Linus Torvalds authored
      Pull KVM updates from Radim Krčmář:
       "ARM:
      
         - icache invalidation optimizations, improving VM startup time
      
         - support for forwarded level-triggered interrupts, improving
           performance for timers and passthrough platform devices
      
         - a small fix for power-management notifiers, and some cosmetic
           changes
      
        PPC:
      
         - add MMIO emulation for vector loads and stores
      
         - allow HPT guests to run on a radix host on POWER9 v2.2 CPUs without
           requiring the complex thread synchronization of older CPU versions
      
         - improve the handling of escalation interrupts with the XIVE
           interrupt controller
      
         - support decrement register migration
      
         - various cleanups and bugfixes.
      
        s390:
      
         - Cornelia Huck passed maintainership to Janosch Frank
      
         - exitless interrupts for emulated devices
      
         - cleanup of cpuflag handling
      
         - kvm_stat counter improvements
      
         - VSIE improvements
      
         - mm cleanup
      
        x86:
      
         - hypervisor part of SEV
      
         - UMIP, RDPID, and MSR_SMI_COUNT emulation
      
         - paravirtualized TLB shootdown using the new KVM_VCPU_PREEMPTED bit
      
         - allow guests to see TOPOEXT, GFNI, VAES, VPCLMULQDQ, and more
           AVX512 features
      
         - show vcpu id in its anonymous inode name
      
         - many fixes and cleanups
      
         - per-VCPU MSR bitmaps (already merged through x86/pti branch)
      
         - stable KVM clock when nesting on Hyper-V (merged through
           x86/hyperv)"
      
      * tag 'kvm-4.16-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (197 commits)
        KVM: PPC: Book3S: Add MMIO emulation for VMX instructions
        KVM: PPC: Book3S HV: Branch inside feature section
        KVM: PPC: Book3S HV: Make HPT resizing work on POWER9
        KVM: PPC: Book3S HV: Fix handling of secondary HPTEG in HPT resizing code
        KVM: PPC: Book3S PR: Fix broken select due to misspelling
        KVM: x86: don't forget vcpu_put() in kvm_arch_vcpu_ioctl_set_sregs()
        KVM: PPC: Book3S PR: Fix svcpu copying with preemption enabled
        KVM: PPC: Book3S HV: Drop locks before reading guest memory
        kvm: x86: remove efer_reload entry in kvm_vcpu_stat
        KVM: x86: AMD Processor Topology Information
        x86/kvm/vmx: do not use vm-exit instruction length for fast MMIO when running nested
        kvm: embed vcpu id to dentry of vcpu anon inode
        kvm: Map PFN-type memory regions as writable (if possible)
        x86/kvm: Make it compile on 32bit and with HYPYERVISOR_GUEST=n
        KVM: arm/arm64: Fixup userspace irqchip static key optimization
        KVM: arm/arm64: Fix userspace_irqchip_in_use counting
        KVM: arm/arm64: Fix incorrect timer_is_pending logic
        MAINTAINERS: update KVM/s390 maintainers
        MAINTAINERS: add Halil as additional vfio-ccw maintainer
        MAINTAINERS: add David as a reviewer for KVM/s390
        ...
    • powerpc/pci: Fix broken INTx configuration via OF · c591c2e3
      Alexey Kardashevskiy authored
      59f47eff ("powerpc/pci: Use of_irq_parse_and_map_pci() helper")
      replaced of_irq_parse_pci() + irq_create_of_mapping() with
      of_irq_parse_and_map_pci(), but neglected to capture the virq
      returned by irq_create_of_mapping(), so virq remained zero, which
      caused INTx configuration to fail.
      
      Save the virq value returned by of_irq_parse_and_map_pci() and correct
      the virq declaration to match the of_irq_parse_and_map_pci() signature.
      
      Fixes: 59f47eff "powerpc/pci: Use of_irq_parse_and_map_pci() helper"
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      [bhelgaas: changelog]
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
    • Merge tag 'kbuild-v4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild · 9a61df9e
      Linus Torvalds authored
      Pull more Kbuild updates from Masahiro Yamada:
       "Makefile changes:
         - enable unused-variable warning that was wrongly disabled for clang
      
        Kconfig changes:
         - warn about blank 'help' and fix existing instances
         - fix 'choice' behavior to not write out invisible symbols
         - fix misc weirdness
      
        Coccinelle changes:
         - fix false positive of free after managed memory alloc detection
         - improve performance of NULL dereference detection"
      
      * tag 'kbuild-v4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (21 commits)
        kconfig: remove const qualifier from sym_expand_string_value()
        kconfig: add xrealloc() helper
        kconfig: send error messages to stderr
        kconfig: echo stdin to stdout if either is redirected
        kconfig: remove check_stdin()
        kconfig: remove 'config*' pattern from .gitignnore
        kconfig: show '?' prompt even if no help text is available
        kconfig: do not write choice values when their dependency becomes n
        coccinelle: deref_null: avoid useless computation
        coccinelle: devm_free: reduce false positives
        kbuild: clang: disable unused variable warnings only when constant
        kconfig: Warn if help text is blank
        nios2: kconfig: Remove blank help text
        arm: vt8500: kconfig: Remove blank help text
        MIPS: kconfig: Remove blank help text
        MIPS: BCM63XX: kconfig: Remove blank help text
        lib/Kconfig.debug: Remove blank help text
        Staging: rtl8192e: kconfig: Remove blank help text
        Staging: rtl8192u: kconfig: Remove blank help text
        mmc: kconfig: Remove blank help text
        ...
    • 7a501609
      Al Viro authored
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 878e66d0
      Linus Torvalds authored
      Pull misc vfs fixes from Al Viro.
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        seq_file: fix incomplete reset on read from zero offset
        kernfs: fix regression in kernfs_fop_write caused by wrong type
    • kconfig: remove const qualifier from sym_expand_string_value() · 523ca58b
      Masahiro Yamada authored
      This function returns realloc'ed memory, so the returned pointer
      must be passed to free() when done.  So, 'const' qualifier is odd.
      It is allowed to modify the expanded string.
      Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
    • kconfig: add xrealloc() helper · d717f24d
      Masahiro Yamada authored
      We already have xmalloc(), xcalloc().  Add xrealloc() as well
      to save tedious error handling.
      Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
    • platform/x86: mlx-platform: Add support for new 200G IB and Ethernet systems · 1bd42d94
      Vadim Pasternak authored
      Add support for new Mellanox system types of basic classes qmb7, sn34,
      and sn37, covering the QMB700 (40x200GbE InfiniBand switch), SN3700
      (32x200GbE and 16x400GbE Ethernet switch) and SN3410 (6x400GbE plus
      48x50GbE Ethernet switch) systems. These are Top of the Rack systems,
      equipped with a Mellanox COM-Express carrier board and a switch board
      with either a Mellanox Quantum device, which supports InfiniBand
      switching with 40X200G ports and a line rate of up to HDR speed, or a
      Mellanox Spectrum-2 device, which supports Ethernet switching with
      32X200G ports and a line rate of up to HDR speed.
      Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
      Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
      1bd42d94
    • Vadim Pasternak's avatar
      platform/x86: mlx-platform: Add support for new msn201x system type · a49a4148
      Vadim Pasternak authored
      Add support for new Mellanox system types of the basic half-unit-size
      class msn201x, covering the MSN2010 system (18x10GbE plus 4x4x25GbE)
      and its derivatives. This is a Top of the Rack system, equipped with a
      Mellanox Small Form Factor carrier board and a switch board with a
      Mellanox Spectrum device, which supports Ethernet switching with
      32X100G ports and a line rate of up to EDR speed.
      Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
      Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
      a49a4148
    • Vadim Pasternak's avatar
      platform/x86: mlx-platform: Add support for new msn274x system type · ef08e14a
      Vadim Pasternak authored
      Add support for new Mellanox system types of basic class msn274x,
      covering the MSN2740 system (32x100GbE Ethernet switch with cost
      reduction) and its derivatives. These are Top of the Rack systems,
      equipped with a Mellanox Small Form Factor carrier board and a switch
      board with a Mellanox Spectrum device, which supports Ethernet
      switching with 32X100G ports and a line rate of up to EDR speed.
      Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
      Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
      ef08e14a
  4. 09 Feb, 2018 11 commits
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · c839682c
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Make allocations less aggressive in x_tables, from Michal Hocko.
      
       2) Fix netfilter flowtable Kconfig deps, from Pablo Neira Ayuso.
      
       3) Fix connection loss problems in rtlwifi, from Larry Finger.
      
       4) Correct DRAM dump length for some chips in ath10k driver, from Yu
          Wang.
      
       5) Fix ABORT handling in rxrpc, from David Howells.
      
       6) Add SPDX tags to Sun networking drivers, from Shannon Nelson.
      
       7) Some ipv6 onlink handling fixes, from David Ahern.
      
       8) Netem packet scheduler interval calculation fix, from Md. Islam.
      
       9) Don't put crypto buffers on-stack in rxrpc, from David Howells.
      
      10) Fix handling of error non-delivery status in netlink multicast
          delivery over multiple namespaces, from Nicolas Dichtel.
      
      11) Missing xdp flush in tuntap driver, from Jason Wang.
      
      12) Synchronize RDS protocol netns/module teardown with rds object
          management, from Sowmini Varadhan.
      
      13) Add nospec annotations to mpls, from Dan Williams.
      
      14) Fix SKB truesize handling in TIPC, from Hoang Le.
      
      15) Interrupt masking fixes in stmmac, from Niklas Cassel.
      
      16) Don't allow ptr_ring objects to be sized outside of kmalloc's
          limits, from Jason Wang.
      
      17) Don't allow SCTP chunks to be built which will have a length
          exceeding the chunk header's 16-bit length field, from Alexey
          Kodanev.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (82 commits)
        ibmvnic: Remove skb->protocol checks in ibmvnic_xmit
        bpf: fix rlimit in reuseport net selftest
        sctp: verify size of a new chunk in _sctp_make_chunk()
        s390/qeth: fix SETIP command handling
        s390/qeth: fix underestimated count of buffer elements
        ptr_ring: try vmalloc() when kmalloc() fails
        ptr_ring: fail early if queue occupies more than KMALLOC_MAX_SIZE
        net: stmmac: remove redundant enable of PMT irq
        net: stmmac: rename GMAC_INT_DEFAULT_MASK for dwmac4
        net: stmmac: discard disabled flags in interrupt status register
        ibmvnic: Reset long term map ID counter
        tools/libbpf: handle issues with bpf ELF objects containing .eh_frames
        selftests/bpf: add selftest that use test_libbpf_open
        selftests/bpf: add test program for loading BPF ELF files
        tools/libbpf: improve the pr_debug statements to contain section numbers
        bpf: Sync kernel ABI header with tooling header for bpf_common.h
        net: phy: fix phy_start to consider PHY_IGNORE_INTERRUPT
        net: thunder: change q_len's type to handle max ring size
        tipc: fix skb truesize/datasize ratio control
        net/sched: cls_u32: fix cls_u32 on filter replace
        ...
      c839682c
    • Linus Torvalds's avatar
      Merge tag 'nfs-for-4.16-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs · 82f0a41e
      Linus Torvalds authored
      Pull more NFS client updates from Trond Myklebust:
       "A few bugfixes and some small sunrpc latency/performance improvements
        before the merge window closes:
      
        Stable fixes:
      
         - fix an incorrect calculation of the RDMA send scatter gather
           element limit
      
         - fix an Oops when attempting to free resources after RDMA device
           removal
      
        Bugfixes:
      
         - SUNRPC: Ensure we always release the TCP socket in a timely fashion
           when the connection is shut down.
      
         - SUNRPC: Don't call __UDPX_INC_STATS() from a preemptible context
      
        Latency/Performance:
      
         - SUNRPC: Queue latency sensitive socket tasks to the less contended
           xprtiod queue
      
         - SUNRPC: Make the xprtiod workqueue unbounded.
      
         - SUNRPC: Make the rpciod workqueue unbounded"
      
      * tag 'nfs-for-4.16-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
        SUNRPC: Don't call __UDPX_INC_STATS() from a preemptible context
        fix parallelism for rpc tasks
        Make the xprtiod workqueue unbounded.
        SUNRPC: Queue latency-sensitive socket tasks to xprtiod
        SUNRPC: Ensure we always close the socket after a connection shuts down
        xprtrdma: Fix BUG after a device removal
        xprtrdma: Fix calculation of ri_max_send_sges
      82f0a41e
    • Linus Torvalds's avatar
      Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending · 858f45bf
      Linus Torvalds authored
      Pull SCSI target updates from Nicholas Bellinger:
       "The highlights include:
      
         - numerous target-core-user improvements related to queue full and
           timeout handling. (MNC)
      
         - prevent target-core-user corruption when invalid data page is
           requested. (MNC)
      
         - add target-core device action configfs attributes to allow
           user-space to trigger events separate from existing attributes
           exposed to end-users. (MNC)
      
         - fix iscsi-target NULL pointer dereference 4.6+ regression in CHAP
           error path. (David Disseldorp)
      
         - avoid target-core backend UNMAP callbacks if range is zero. (Andrei
           Vagin)
      
         - fix an iscsi-target 4.14+ regression related to multiple PDU logins,
           that was exposed due to removal of TCP prequeue support. (Florian
           Westphal + MNC)
      
        Also, there is an iser-target bug still being worked on for post -rc1
        code to address a long standing issue resulting in persistent
        ib_post_send() failures, for RNICs with small max_send_sge"
      
      * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (36 commits)
        iscsi-target: make sure to wake up sleeping login worker
        tcmu: Fix trailing semicolon
        tcmu: fix cmd user after free
        target: fix destroy device in target_configure_device
        tcmu: allow userspace to reset ring
        target core: add device action configfs files
        tcmu: fix error return code in tcmu_configure_device()
        target_core_user: add cmd id to broken ring message
        target: add SAM_STAT_BUSY sense reason
        tcmu: prevent corruption when invalid data page requested
        target: don't call an unmap callback if a range length is zero
        target/iscsi: avoid NULL dereference in CHAP auth error path
        cxgbit: call neigh_event_send() to update MAC address
        target: tcm_loop: Use seq_puts() in tcm_loop_show_info()
        target: tcm_loop: Delete an unnecessary return statement in tcm_loop_submission_work()
        target: tcm_loop: Delete two unnecessary variable initialisations in tcm_loop_issue_tmr()
        target: tcm_loop: Combine substrings for 26 messages
        target: tcm_loop: Improve a size determination in two functions
        target: tcm_loop: Delete an error message for a failed memory allocation in four functions
        sbp-target: Delete an error message for a failed memory allocation in three functions
        ...
      858f45bf
    • Linus Torvalds's avatar
      Merge tag 'trace-v4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace · 8158c2ff
      Linus Torvalds authored
      Pull tracing fixes from Steven Rostedt:
       "Al Viro discovered some breakage with the parsing of the
        set_ftrace_filter as well as the removing of function probes.
      
        This fixes the code with Al's suggestions. I also added a few
        selftests to test the broken cases such that they won't happen
        again"
      
      * tag 'trace-v4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        selftests/ftrace: Add more tests for removing of function probes
        selftests/ftrace: Add some missing glob checks
        selftests/ftrace: Have reset_ftrace_filter handle multiple instances
        selftests/ftrace: Have reset_ftrace_filter handle modules
        tracing: Fix parsing of globs with a wildcard at the beginning
        ftrace: Remove incorrect setting of glob search field
      8158c2ff
    • Linus Torvalds's avatar
      Merge tag '4.16-minor-rc-SMB3-fixes' of git://git.samba.org/sfrench/cifs-2.6 · a2834832
      Linus Torvalds authored
      Pull cifs fixes from Steve French:
       "There are a couple additional security fixes that are still being
        tested that are not in this set."
      
      * tag '4.16-minor-rc-SMB3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
        Add missing structs and defines from recent SMB3.1.1 documentation
        address lock imbalance warnings in smbdirect.c
        cifs: silence compiler warnings showing up with gcc-8.0.0
        Add some missing debug fields in server and tcon structs
      a2834832
    • Linus Torvalds's avatar
      Merge tag 'fbdev-v4.16-fix' of git://github.com/bzolnier/linux · 58fcba61
      Linus Torvalds authored
      Pull fbdev fix from Bartlomiej Zolnierkiewicz:
       "Fix building of the omapfb driver (Tomi Valkeinen)"
      
      * tag 'fbdev-v4.16-fix' of git://github.com/bzolnier/linux:
        video: omapfb: fix missing #includes
      58fcba61
    • Radim Krčmář's avatar
      Merge tag 'kvm-ppc-next-4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc · 1ab03c07
      Radim Krčmář authored
      Second PPC KVM update for 4.16
      
      Seven fixes that are either trivial or that address bugs that people
      are actually hitting.  The main ones are:
      
      - Drop spinlocks before reading guest memory
      
      - Fix a bug causing corruption of VCPU state in PR KVM with preemption
        enabled
      
      - Make HPT resizing work on POWER9
      
      - Add MMIO emulation for vector loads and stores, because guests now
        use these instructions in memcpy and similar routines.
      1ab03c07
    • Radim Krčmář's avatar
      Merge branch 'msr-bitmaps' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 80132f4c
      Radim Krčmář authored
      This topic branch allocates separate MSR bitmaps for each VCPU.
      This is required for the IBRS enablement to choose, on a per-VM
      basis, whether to intercept the SPEC_CTRL and PRED_CMD MSRs;
      the IBRS enablement comes in through the tip tree.
      80132f4c
    • John Allen's avatar
      ibmvnic: Remove skb->protocol checks in ibmvnic_xmit · 2fa56a49
      John Allen authored
      Having these checks in ibmvnic_xmit causes problems with VLAN
      tagging and balance-alb/tlb bonding modes. The restriction they
      imposed can be removed.
      Signed-off-by: John Allen <jallen@linux.vnet.ibm.com>
      Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2fa56a49
    • Daniel Borkmann's avatar
      bpf: fix rlimit in reuseport net selftest · 941ff6f1
      Daniel Borkmann authored
      Fix two issues in the reuseport_bpf selftests that were
      reported by Linaro CI:
      
        [...]
        + ./reuseport_bpf
        ---- IPv4 UDP ----
        Testing EBPF mod 10...
        Reprograming, testing mod 5...
        ./reuseport_bpf: ebpf error. log:
        0: (bf) r6 = r1
        1: (20) r0 = *(u32 *)skb[0]
        2: (97) r0 %= 10
        3: (95) exit
        processed 4 insns
        : Operation not permitted
        + echo FAIL
        [...]
        ---- IPv4 TCP ----
        Testing EBPF mod 10...
        ./reuseport_bpf: failed to bind send socket: Address already in use
        + echo FAIL
        [...]
      
      For the former, adjust rlimit since this was the cause of
      failure for loading the BPF prog, and for the latter, add
      SO_REUSEADDR.
      Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
      Link: https://bugs.linaro.org/show_bug.cgi?id=3502
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      941ff6f1
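      The two adjustments described in this commit can be sketched in
      userspace C as follows. This is an illustration under stated
      assumptions, not the selftest's actual code: demo_raise_memlock() and
      demo_reuseaddr_socket() are hypothetical names, and the fallback to
      the hard limit is an addition so the sketch also works without
      privileges. It assumes the load failure comes from BPF memory being
      charged against RLIMIT_MEMLOCK.

```c
#include <netinet/in.h>
#include <sys/resource.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Raise RLIMIT_MEMLOCK before loading BPF programs.
 * Tries "unlimited" first; if that is not permitted, raises the soft
 * limit to the current hard limit. Returns 0 on success, -1 on error.
 */
int demo_raise_memlock(void)
{
	struct rlimit rlim = { RLIM_INFINITY, RLIM_INFINITY };

	if (setrlimit(RLIMIT_MEMLOCK, &rlim) == 0)
		return 0;

	/* Unprivileged fallback: soft limit up to the hard limit. */
	if (getrlimit(RLIMIT_MEMLOCK, &rlim) != 0)
		return -1;
	rlim.rlim_cur = rlim.rlim_max;
	return setrlimit(RLIMIT_MEMLOCK, &rlim);
}

/*
 * Create an IPv4 socket of the given type (SOCK_DGRAM or SOCK_STREAM)
 * with SO_REUSEADDR set, so a later bind() is less likely to fail with
 * "Address already in use". Returns the fd, or -1 on error.
 */
int demo_reuseaddr_socket(int type)
{
	int one = 1;
	int fd = socket(AF_INET, type, 0);

	if (fd < 0)
		return -1;
	if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one)) != 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```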
    • Alexey Kodanev's avatar
      sctp: verify size of a new chunk in _sctp_make_chunk() · 07f2c7ab
      Alexey Kodanev authored
      When SCTP makes an INIT or INIT_ACK packet, the total chunk length
      can exceed SCTP_MAX_CHUNK_LEN, which leads to a kernel panic when
      transmitting these packets, e.g. the crash on sending INIT_ACK:
      
      [  597.804948] skbuff: skb_over_panic: text:00000000ffae06e4 len:120168
                     put:120156 head:000000007aa47635 data:00000000d991c2de
                     tail:0x1d640 end:0xfec0 dev:<NULL>
      ...
      [  597.976970] ------------[ cut here ]------------
      [  598.033408] kernel BUG at net/core/skbuff.c:104!
      [  600.314841] Call Trace:
      [  600.345829]  <IRQ>
      [  600.371639]  ? sctp_packet_transmit+0x2095/0x26d0 [sctp]
      [  600.436934]  skb_put+0x16c/0x200
      [  600.477295]  sctp_packet_transmit+0x2095/0x26d0 [sctp]
      [  600.540630]  ? sctp_packet_config+0x890/0x890 [sctp]
      [  600.601781]  ? __sctp_packet_append_chunk+0x3b4/0xd00 [sctp]
      [  600.671356]  ? sctp_cmp_addr_exact+0x3f/0x90 [sctp]
      [  600.731482]  sctp_outq_flush+0x663/0x30d0 [sctp]
      [  600.788565]  ? sctp_make_init+0xbf0/0xbf0 [sctp]
      [  600.845555]  ? sctp_check_transmitted+0x18f0/0x18f0 [sctp]
      [  600.912945]  ? sctp_outq_tail+0x631/0x9d0 [sctp]
      [  600.969936]  sctp_cmd_interpreter.isra.22+0x3be1/0x5cb0 [sctp]
      [  601.041593]  ? sctp_sf_do_5_1B_init+0x85f/0xc30 [sctp]
      [  601.104837]  ? sctp_generate_t1_cookie_event+0x20/0x20 [sctp]
      [  601.175436]  ? sctp_eat_data+0x1710/0x1710 [sctp]
      [  601.233575]  sctp_do_sm+0x182/0x560 [sctp]
      [  601.284328]  ? sctp_has_association+0x70/0x70 [sctp]
      [  601.345586]  ? sctp_rcv+0xef4/0x32f0 [sctp]
      [  601.397478]  ? sctp6_rcv+0xa/0x20 [sctp]
      ...
      
      Here the chunk size for the INIT_ACK packet becomes too big, mostly
      because of the state cookie (the INIT packet is large, with
      many address parameters), plus additional server parameters.
      
      Later this chunk causes the panic in skb_put_data():
      
        sctp_packet_transmit()
            sctp_packet_pack()
                skb_put_data(nskb, chunk->skb->data, chunk->skb->len);
      
      'nskb' (head skb) was previously allocated with packet->size
      from u16 'chunk->chunk_hdr->length'.
      
      As suggested by Marcelo, we should check the chunk's length in
      _sctp_make_chunk() before trying to allocate an skb for it, and
      discard a chunk if its size is bigger than SCTP_MAX_CHUNK_LEN.
      Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
      Acked-by: Marcelo Ricardo Leitner <marcelo.leinter@gmail.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      07f2c7ab
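      The check described in this commit, validating the padded chunk length
      before allocating anything, can be sketched in userspace C as below.
      This is not the kernel code: the demo_ names, the struct layout and
      the value of DEMO_MAX_CHUNK_LEN are assumptions chosen for
      illustration, and the length is stored in host byte order for
      simplicity.

```c
#include <stdint.h>
#include <stdlib.h>

/* Illustrative cap for a 16-bit length field; the kernel's
 * SCTP_MAX_CHUNK_LEN may be defined differently. */
#define DEMO_MAX_CHUNK_LEN 65535u
/* Round n up to a multiple of 4 bytes. */
#define DEMO_PAD4(n) (((size_t)(n) + 3u) & ~(size_t)3u)

struct demo_chunk_hdr {
	uint8_t  type;
	uint8_t  flags;
	uint16_t length;	/* header + payload, host byte order here */
};

struct demo_chunk {
	struct demo_chunk_hdr hdr;
	uint8_t payload[];
};

/*
 * Build an empty chunk with room for paylen payload bytes.
 * Returns NULL, without allocating, if the padded chunk would not fit
 * in the 16-bit length field, or if the allocation itself fails.
 */
struct demo_chunk *demo_make_chunk(uint8_t type, size_t paylen)
{
	struct demo_chunk *chunk;
	size_t chunklen;

	/* Reject absurd payloads first so the sum below cannot overflow. */
	if (paylen > DEMO_MAX_CHUNK_LEN)
		return NULL;

	chunklen = DEMO_PAD4(sizeof(struct demo_chunk_hdr) + paylen);
	if (chunklen > DEMO_MAX_CHUNK_LEN)
		return NULL;

	chunk = calloc(1, chunklen);
	if (!chunk)
		return NULL;

	chunk->hdr.type = type;
	chunk->hdr.length = (uint16_t)(sizeof(struct demo_chunk_hdr) + paylen);
	return chunk;
}
```

      Rejecting the oversized request up front means no buffer is ever
      sized from a length that the 16-bit header field cannot represent.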