1. 10 Jun, 2016 40 commits
    • Andy Honig's avatar
      KVM: MTRR: remove MSR 0x2f8 · 62ab3693
      Andy Honig authored
      commit 9842df62 upstream.
      
      MSR 0x2f8 accessed the 124th Variable Range MTRR ever since MTRR support
      was introduced by 9ba075a6 ("KVM: MTRR support").
      
      0x2f8 became harmful when 910a6aae ("KVM: MTRR: exactly define the
      size of variable MTRRs") shrinked the array of VR MTRRs from 256 to 8,
      which made access to index 124 out of bounds.  The surrounding code only
      WARNs in this situation, thus the guest gained a limited read/write
      access to struct kvm_arch_vcpu.
      
      0x2f8 is not a valid VR MTRR MSR, because KVM has/advertises only 16 VR
      MTRR MSRs, 0x200-0x20f.  Every VR MTRR is set up using two MSRs, 0x2f8
      was treated as a PHYSBASE and 0x2f9 would be its PHYSMASK, but 0x2f9 was
      not implemented in KVM, therefore 0x2f8 could never do anything useful
      and getting rid of it is safe.
      
      This fixes CVE-2016-3713.
      
      Fixes: 910a6aae ("KVM: MTRR: exactly define the size of variable MTRRs")
      Reported-by: default avatarDavid Matlack <dmatlack@google.com>
      Signed-off-by: default avatarAndy Honig <ahonig@google.com>
      Signed-off-by: default avatarRadim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      62ab3693
    • Dave Chinner's avatar
      xfs: skip stale inodes in xfs_iflush_cluster · 0cb2d772
      Dave Chinner authored
      commit 7d3aa7fe upstream.
      
      We don't write back stale inodes so we should skip them in
      xfs_iflush_cluster, too.
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarBrian Foster <bfoster@redhat.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarDave Chinner <david@fromorbit.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      0cb2d772
    • Dave Chinner's avatar
      xfs: fix inode validity check in xfs_iflush_cluster · 87994ff4
      Dave Chinner authored
      commit 51b07f30 upstream.
      
      Some careless idiot(*) wrote crap code in commit 1a3e8f3d ("xfs:
      convert inode cache lookups to use RCU locking") back in late 2010,
      and so xfs_iflush_cluster checks the wrong inode for whether it is
      still valid under RCU protection. Fix it to lock and check the
      correct inode.
      
      (*) Careless-idiot: Dave Chinner <dchinner@redhat.com>
      Discovered-by: default avatarBrain Foster <bfoster@redhat.com>
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarDave Chinner <david@fromorbit.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      87994ff4
    • Dave Chinner's avatar
      xfs: xfs_iflush_cluster fails to abort on error · 9b63d6af
      Dave Chinner authored
      commit b1438f47 upstream.
      
      When a failure due to an inode buffer occurs, the error handling
      fails to abort the inode writeback correctly. This can result in the
      inode being reclaimed whilst still in the AIL, leading to
      use-after-free situations as well as filesystems that cannot be
      unmounted as the inode log items left in the AIL never get removed.
      
      Fix this by ensuring fatal errors from xfs_imap_to_bp() result in
      the inode flush being aborted correctly.
      Reported-by: default avatarShyam Kaushik <shyam@zadarastorage.com>
      Diagnosed-by: default avatarShyam Kaushik <shyam@zadarastorage.com>
      Tested-by: default avatarShyam Kaushik <shyam@zadarastorage.com>
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarDave Chinner <david@fromorbit.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      9b63d6af
    • Daniel Lezcano's avatar
      cpuidle: Fix cpuidle_state_is_coupled() argument in cpuidle_enter() · 3f53052e
      Daniel Lezcano authored
      commit e7387da5 upstream.
      
      Commit 0b89e9aa (cpuidle: delay enabling interrupts until all
      coupled CPUs leave idle) rightfully fixed a regression by letting
      the coupled idle state framework to handle local interrupt enabling
      when the CPU is exiting an idle state.
      
      The current code checks if the idle state is coupled and, if so, it
      will let the coupled code to enable interrupts. This way, it can
      decrement the ready-count before handling the interrupt. This
      mechanism prevents the other CPUs from waiting for a CPU which is
      handling interrupts.
      
      But the check is done against the state index returned by the back
      end driver's ->enter functions which could be different from the
      initial index passed as parameter to the cpuidle_enter_state()
      function.
      
       entered_state = target_state->enter(dev, drv, index);
      
       [ ... ]
      
       if (!cpuidle_state_is_coupled(drv, entered_state))
      	local_irq_enable();
      
       [ ... ]
      
      If the 'index' is referring to a coupled idle state but the
      'entered_state' is *not* coupled, then the interrupts are enabled
      again. All CPUs blocked on the sync barrier may busy loop longer
      if the CPU has interrupts to handle before decrementing the
      ready-count. That's consuming more energy than saving.
      
      Fixes: 0b89e9aa (cpuidle: delay enabling interrupts until all coupled CPUs leave idle)
      Signed-off-by: default avatarDaniel Lezcano <daniel.lezcano@linaro.org>
      [ rjw: Subject & changelog ]
      Signed-off-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      [ kamal: backport to 4.2-stable: context ]
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      3f53052e
    • Steve French's avatar
      remove directory incorrectly tries to set delete on close on non-empty directories · b2d06481
      Steve French authored
      commit 897fba11 upstream.
      
      Wrong return code was being returned on SMB3 rmdir of
      non-empty directory.
      
      For SMB3 (unlike for cifs), we attempt to delete a directory by
      set of delete on close flag on the open. Windows clients set
      this flag via a set info (SET_FILE_DISPOSITION to set this flag)
      which properly checks if the directory is empty.
      
      With this patch on smb3 mounts we correctly return
       "DIRECTORY NOT EMPTY"
      on attempts to remove a non-empty directory.
      Signed-off-by: default avatarSteve French <steve.french@primarydata.com>
      Acked-by: default avatarSachin Prabhu <sprabhu@redhat.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      b2d06481
    • Stefan Metzmacher's avatar
      fs/cifs: correctly to anonymous authentication for the NTLM(v2) authentication · 03161ff5
      Stefan Metzmacher authored
      commit 1a967d6c upstream.
      
      Only server which map unknown users to guest will allow
      access using a non-null NTLMv2_Response.
      
      For Samba it's the "map to guest = bad user" option.
      
      BUG: https://bugzilla.samba.org/show_bug.cgi?id=11913Signed-off-by: default avatarStefan Metzmacher <metze@samba.org>
      Signed-off-by: default avatarSteve French <smfrench@gmail.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      03161ff5
    • Stefan Metzmacher's avatar
      fs/cifs: correctly to anonymous authentication for the NTLM(v1) authentication · 0430e786
      Stefan Metzmacher authored
      commit 777f69b8 upstream.
      
      Only server which map unknown users to guest will allow
      access using a non-null NTChallengeResponse.
      
      For Samba it's the "map to guest = bad user" option.
      
      BUG: https://bugzilla.samba.org/show_bug.cgi?id=11913Signed-off-by: default avatarStefan Metzmacher <metze@samba.org>
      Signed-off-by: default avatarSteve French <smfrench@gmail.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      0430e786
    • Stefan Metzmacher's avatar
      fs/cifs: correctly to anonymous authentication for the LANMAN authentication · b1b5741b
      Stefan Metzmacher authored
      commit fa8f3a35 upstream.
      
      Only server which map unknown users to guest will allow
      access using a non-null LMChallengeResponse.
      
      For Samba it's the "map to guest = bad user" option.
      
      BUG: https://bugzilla.samba.org/show_bug.cgi?id=11913Signed-off-by: default avatarStefan Metzmacher <metze@samba.org>
      Signed-off-by: default avatarSteve French <smfrench@gmail.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      b1b5741b
    • Stefan Metzmacher's avatar
      fs/cifs: correctly to anonymous authentication via NTLMSSP · 051868ea
      Stefan Metzmacher authored
      commit cfda35d9 upstream.
      
      See [MS-NLMP] 3.2.5.1.2 Server Receives an AUTHENTICATE_MESSAGE from the Client:
      
         ...
         Set NullSession to FALSE
         If (AUTHENTICATE_MESSAGE.UserNameLen == 0 AND
            AUTHENTICATE_MESSAGE.NtChallengeResponse.Length == 0 AND
            (AUTHENTICATE_MESSAGE.LmChallengeResponse == Z(1)
             OR
             AUTHENTICATE_MESSAGE.LmChallengeResponse.Length == 0))
             -- Special case: client requested anonymous authentication
             Set NullSession to TRUE
         ...
      
      Only server which map unknown users to guest will allow
      access using a non-null NTChallengeResponse.
      
      For Samba it's the "map to guest = bad user" option.
      
      BUG: https://bugzilla.samba.org/show_bug.cgi?id=11913Signed-off-by: default avatarStefan Metzmacher <metze@samba.org>
      Signed-off-by: default avatarSteve French <smfrench@gmail.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      051868ea
    • Lyude's avatar
      drm/fb_helper: Fix references to dev->mode_config.num_connector · a31acb15
      Lyude authored
      commit 255f0e7c upstream.
      
      During boot, MST hotplugs are generally expected (even if no physical
      hotplugging occurs) and result in DRM's connector topology changing.
      This means that using num_connector from the current mode configuration
      can lead to the number of connectors changing under us. This can lead to
      some nasty scenarios in fbcon:
      
      - We allocate an array to the size of dev->mode_config.num_connectors.
      - MST hotplug occurs, dev->mode_config.num_connectors gets incremented.
      - We try to loop through each element in the array using the new value
        of dev->mode_config.num_connectors, and end up going out of bounds
        since dev->mode_config.num_connectors is now larger then the array we
        allocated.
      
      fb_helper->connector_count however, will always remain consistent while
      we do a modeset in fb_helper.
      
      Note: This is just polish for 4.7, Dave Airlie's drm_connector
      refcounting fixed these bugs for real. But it's good enough duct-tape
      for stable kernel backporting, since backporting the refcounting
      changes is way too invasive.
      Signed-off-by: default avatarLyude <cpaul@redhat.com>
      [danvet: Clarify why we need this. Also remove the now unused "dev"
      local variable to appease gcc.]
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      Link: http://patchwork.freedesktop.org/patch/msgid/1463065021-18280-3-git-send-email-cpaul@redhat.comSigned-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      a31acb15
    • Lyude's avatar
      drm/i915/fbdev: Fix num_connector references in intel_fb_initial_config() · 29b35727
      Lyude authored
      commit 14a3842a upstream.
      
      During boot time, MST devices usually send a ton of hotplug events
      irregardless of whether or not any physical hotplugs actually occurred.
      Hotplugs mean connectors being created/destroyed, and the number of DRM
      connectors changing under us. This isn't a problem if we use
      fb_helper->connector_count since we only set it once in the code,
      however if we use num_connector from struct drm_mode_config we risk it's
      value changing under us. On top of that, there's even a chance that
      dev->mode_config.num_connector != fb_helper->connector_count. If the
      number of connectors happens to increase under us, we'll end up using
      the wrong array size for memcpy and start writing beyond the actual
      length of the array, occasionally resulting in kernel panics.
      
      Note: This is just polish for 4.7, Dave Airlie's drm_connector
      refcounting fixed these bugs for real. But it's good enough duct-tape
      for stable kernel backporting, since backporting the refcounting
      changes is way too invasive.
      Signed-off-by: default avatarLyude <cpaul@redhat.com>
      [danvet: Clarify why we need this.]
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      Link: http://patchwork.freedesktop.org/patch/msgid/1463065021-18280-2-git-send-email-cpaul@redhat.comSigned-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      29b35727
    • Maciej W. Rozycki's avatar
      MIPS: MSA: Fix a link error on `_init_msa_upper' with older GCC · 8a51e226
      Maciej W. Rozycki authored
      commit e49d3848 upstream.
      
      Fix a build regression from commit c9017757 ("MIPS: init upper 64b
      of vector registers when MSA is first used"):
      
      arch/mips/built-in.o: In function `enable_restore_fp_context':
      traps.c:(.text+0xbb90): undefined reference to `_init_msa_upper'
      traps.c:(.text+0xbb90): relocation truncated to fit: R_MIPS_26 against `_init_msa_upper'
      traps.c:(.text+0xbef0): undefined reference to `_init_msa_upper'
      traps.c:(.text+0xbef0): relocation truncated to fit: R_MIPS_26 against `_init_msa_upper'
      
      to !CONFIG_CPU_HAS_MSA configurations with older GCC versions, which are
      unable to figure out that calls to `_init_msa_upper' are indeed dead.
      Of the many ways to tackle this failure choose the approach we have
      already taken in `thread_msa_context_live'.
      
      [ralf@linux-mips.org: Drop patch segment to junk file.]
      Signed-off-by: default avatarMaciej W. Rozycki <macro@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/13271/Signed-off-by: default avatarRalf Baechle <ralf@linux-mips.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      8a51e226
    • Prarit Bhargava's avatar
      PCI: Disable all BAR sizing for devices with non-compliant BARs · bcdc7d4a
      Prarit Bhargava authored
      commit ad67b437 upstream.
      
      b84106b4 ("PCI: Disable IO/MEM decoding for devices with non-compliant
      BARs") disabled BAR sizing for BARs 0-5 of devices that don't comply with
      the PCI spec.  But it didn't do anything for expansion ROM BARs, so we
      still try to size them, resulting in warnings like this on Broadwell-EP:
      
        pci 0000:ff:12.0: BAR 6: failed to assign [mem size 0x00000001 pref]
      
      Move the non-compliant BAR check from __pci_read_base() up to
      pci_read_bases() so it applies to the expansion ROM BAR as well as
      to BARs 0-5.
      
      Note that direct callers of __pci_read_base(), like sriov_init(), will now
      bypass this check.  We haven't had reports of devices with broken SR-IOV
      BARs yet.
      
      [bhelgaas: changelog]
      Fixes: b84106b4 ("PCI: Disable IO/MEM decoding for devices with non-compliant BARs")
      Signed-off-by: default avatarPrarit Bhargava <prarit@redhat.com>
      Signed-off-by: default avatarBjorn Helgaas <bhelgaas@google.com>
      CC: Thomas Gleixner <tglx@linutronix.de>
      CC: Ingo Molnar <mingo@redhat.com>
      CC: "H. Peter Anvin" <hpa@zytor.com>
      CC: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      bcdc7d4a
    • Adrian Hunter's avatar
      mmc: mmc: Fix partition switch timeout for some eMMCs · 05daa23f
      Adrian Hunter authored
      commit 1c447116 upstream.
      
      Some eMMCs set the partition switch timeout too low.
      
      Now typically eMMCs are considered a critical component (e.g. because
      they store the root file system) and consequently are expected to be
      reliable.  Thus we can neglect the use case where eMMCs can't switch
      reliably and we might want a lower timeout to facilitate speedy
      recovery.
      
      Although we could employ a quirk for the cards that are affected (if
      we could identify them all), as described above, there is little
      benefit to having a low timeout, so instead simply set a minimum
      timeout.
      
      The minimum is set to 300ms somewhat arbitrarily - the examples that
      have been seen had a timeout of 10ms but were sometimes taking 60-70ms.
      Signed-off-by: default avatarAdrian Hunter <adrian.hunter@intel.com>
      Signed-off-by: default avatarUlf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      05daa23f
    • Steven Rostedt (Red Hat)'s avatar
      ring-buffer: Prevent overflow of size in ring_buffer_resize() · 5be13df6
      Steven Rostedt (Red Hat) authored
      commit 59643d15 upstream.
      
      If the size passed to ring_buffer_resize() is greater than MAX_LONG - BUF_PAGE_SIZE
      then the DIV_ROUND_UP() will return zero.
      
      Here's the details:
      
        # echo 18014398509481980 > /sys/kernel/debug/tracing/buffer_size_kb
      
      tracing_entries_write() processes this and converts kb to bytes.
      
       18014398509481980 << 10 = 18446744073709547520
      
      and this is passed to ring_buffer_resize() as unsigned long size.
      
       size = DIV_ROUND_UP(size, BUF_PAGE_SIZE);
      
      Where DIV_ROUND_UP(a, b) is (a + b - 1)/b
      
      BUF_PAGE_SIZE is 4080 and here
      
       18446744073709547520 + 4080 - 1 = 18446744073709551599
      
      where 18446744073709551599 is still smaller than 2^64
      
       2^64 - 18446744073709551599 = 17
      
      But now 18446744073709551599 / 4080 = 4521260802379792
      
      and size = size * 4080 = 18446744073709551360
      
      This is checked to make sure its still greater than 2 * 4080,
      which it is.
      
      Then we convert to the number of buffer pages needed.
      
       nr_page = DIV_ROUND_UP(size, BUF_PAGE_SIZE)
      
      but this time size is 18446744073709551360 and
      
       2^64 - (18446744073709551360 + 4080 - 1) = -3823
      
      Thus it overflows and the resulting number is less than 4080, which makes
      
        3823 / 4080 = 0
      
      an nr_pages is set to this. As we already checked against the minimum that
      nr_pages may be, this causes the logic to fail as well, and we crash the
      kernel.
      
      There's no reason to have the two DIV_ROUND_UP() (that's just result of
      historical code changes), clean up the code and fix this bug.
      
      Fixes: 83f40318 ("ring-buffer: Make removal of ring buffer pages atomic")
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      5be13df6
    • Steven Rostedt (Red Hat)'s avatar
      ring-buffer: Use long for nr_pages to avoid overflow failures · de2c7afc
      Steven Rostedt (Red Hat) authored
      commit 9b94a8fb upstream.
      
      The size variable to change the ring buffer in ftrace is a long. The
      nr_pages used to update the ring buffer based on the size is int. On 64 bit
      machines this can cause an overflow problem.
      
      For example, the following will cause the ring buffer to crash:
      
       # cd /sys/kernel/debug/tracing
       # echo 10 > buffer_size_kb
       # echo 8556384240 > buffer_size_kb
      
      Then you get the warning of:
      
       WARNING: CPU: 1 PID: 318 at kernel/trace/ring_buffer.c:1527 rb_update_pages+0x22f/0x260
      
      Which is:
      
        RB_WARN_ON(cpu_buffer, nr_removed);
      
      Note each ring buffer page holds 4080 bytes.
      
      This is because:
      
       1) 10 causes the ring buffer to have 3 pages.
          (10kb requires 3 * 4080 pages to hold)
      
       2) (2^31 / 2^10  + 1) * 4080 = 8556384240
          The value written into buffer_size_kb is shifted by 10 and then passed
          to ring_buffer_resize(). 8556384240 * 2^10 = 8761737461760
      
       3) The size passed to ring_buffer_resize() is then divided by BUF_PAGE_SIZE
          which is 4080. 8761737461760 / 4080 = 2147484672
      
       4) nr_pages is subtracted from the current nr_pages (3) and we get:
          2147484669. This value is saved in a signed integer nr_pages_to_update
      
       5) 2147484669 is greater than 2^31 but smaller than 2^32, a signed int
          turns into the value of -2147482627
      
       6) As the value is a negative number, in update_pages_handler() it is
          negated and passed to rb_remove_pages() and 2147482627 pages will
          be removed, which is much larger than 3 and it causes the warning
          because not all the pages asked to be removed were removed.
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=118001
      
      Fixes: 7a8e76a3 ("tracing: unified trace buffer")
      Reported-by: default avatarHao Qin <QEver.cn@gmail.com>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      de2c7afc
    • Paul Burton's avatar
      MIPS: Force CPUs to lose FP context during mode switches · dd08b329
      Paul Burton authored
      commit 6b832257 upstream.
      
      Commit 9791554b ("MIPS,prctl: add PR_[GS]ET_FP_MODE prctl options
      for MIPS") added support for the PR_SET_FP_MODE prctl, which allows a
      userland program to modify its FP mode at runtime. This is most notably
      required if dynamic linking leads to the FP mode requirement changing at
      runtime from that indicated in the initial executable's ELF header. In
      order to avoid overhead in the general FP context restore code, it aimed
      to have threads in the process become unable to enable the FPU during a
      mode switch & have the thread calling the prctl syscall wait for all
      other threads in the process to be context switched at least once. Once
      that happens we can know that no thread in the process whose mode will
      be switched has live FP context, and it's safe to perform the mode
      switch. However in the (rare) case of modeswitches occurring in
      multithreaded programs this can lead to indeterminate delays for the
      thread invoking the prctl syscall, and the code monitoring for those
      context switches was woefully inadequate for all but the simplest cases.
      
      Fix this by broadcasting an IPI if other CPUs may have live FP context
      for an affected thread, with a handler causing those CPUs to relinquish
      their FPU ownership. Threads will then be allowed to continue running
      but will stall on the wait_on_atomic_t in enable_restore_fp_context if
      they attempt to use FP again whilst the mode switch is still in
      progress. The end result is less fragile poking at scheduler context
      switch counts & a more expedient completion of the mode switch.
      Signed-off-by: default avatarPaul Burton <paul.burton@imgtec.com>
      Fixes: 9791554b ("MIPS,prctl: add PR_[GS]ET_FP_MODE prctl options for MIPS")
      Reviewed-by: default avatarMaciej W. Rozycki <macro@imgtec.com>
      Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Cc: linux-kernel@vger.kernel.org
      Patchwork: https://patchwork.linux-mips.org/patch/13145/Signed-off-by: default avatarRalf Baechle <ralf@linux-mips.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      dd08b329
    • Paul Burton's avatar
      MIPS: Disable preemption during prctl(PR_SET_FP_MODE, ...) · 0b1fe3dc
      Paul Burton authored
      commit bd239f1e upstream.
      
      Whilst a PR_SET_FP_MODE prctl is performed there are decisions made
      based upon whether the task is executing on the current CPU. This may
      change if we're preempted, so disable preemption to avoid such changes
      for the lifetime of the mode switch.
      Signed-off-by: default avatarPaul Burton <paul.burton@imgtec.com>
      Fixes: 9791554b ("MIPS,prctl: add PR_[GS]ET_FP_MODE prctl options for MIPS")
      Reviewed-by: default avatarMaciej W. Rozycki <macro@imgtec.com>
      Tested-by: default avatarAurelien Jarno <aurelien@aurel32.net>
      Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Cc: linux-kernel@vger.kernel.org
      Patchwork: https://patchwork.linux-mips.org/patch/13144/Signed-off-by: default avatarRalf Baechle <ralf@linux-mips.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      0b1fe3dc
    • Maciej W. Rozycki's avatar
      MIPS: ptrace: Prevent writes to read-only FCSR bits · 79b09819
      Maciej W. Rozycki authored
      commit abf378be upstream.
      
      Correct the cases missed with commit 9b26616c ("MIPS: Respect the
      ISA level in FCSR handling") and prevent writes to read-only FCSR bits
      there.
      
      This in particular applies to FP context initialisation where any IEEE
      754-2008 bits preset by `mips_set_personality_nan' are cleared before
      the relevant ptrace(2) call takes effect and the PTRACE_POKEUSR request
      addressing FPC_CSR where no masking of read-only FCSR bits is done.
      
      Remove the FCSR clearing from FP context initialisation then and unify
      PTRACE_POKEUSR/FPC_CSR and PTRACE_SETFPREGS handling, by factoring out
      code from `ptrace_setfpregs' and calling it from both places.
      
      This mostly matters to soft float configurations where the emulator can
      be switched this way to a mode which should not be accessible and cannot
      be set with the CTC1 instruction.  With hard float configurations any
      effect is transient anyway as read-only bits will retain their values at
      the time the FP context is restored.
      Signed-off-by: default avatarMaciej W. Rozycki <macro@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/13239/Signed-off-by: default avatarRalf Baechle <ralf@linux-mips.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      79b09819
    • Maciej W. Rozycki's avatar
      MIPS: ptrace: Fix FP context restoration FCSR regression · 0df22462
      Maciej W. Rozycki authored
      commit 42495484 upstream.
      
      Fix a floating-point context restoration regression introduced with
      commit 9b26616c ("MIPS: Respect the ISA level in FCSR handling")
      that causes a Floating Point exception and consequently a kernel oops
      with hard float configurations when one or more FCSR Enable and their
      corresponding Cause bits are set both at a time via a ptrace(2) call.
      
      To do so reinstate Cause bit masking originally introduced with commit
      b1442d39 ("MIPS: Prevent user from setting FCSR cause bits") to
      address this exact problem and then inadvertently removed from the
      PTRACE_SETFPREGS request with the commit referred above.
      Signed-off-by: default avatarMaciej W. Rozycki <macro@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/13238/Signed-off-by: default avatarRalf Baechle <ralf@linux-mips.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      0df22462
    • Paul Burton's avatar
      MIPS: math-emu: Fix jalr emulation when rd == $0 · 4051f5a8
      Paul Burton authored
      commit ab4a92e6 upstream.
      
      When emulating a jalr instruction with rd == $0, the code in
      isBranchInstr was incorrectly writing to GPR $0 which should actually
      always remain zeroed. This would lead to any further instructions
      emulated which use $0 operating on a bogus value until the task is next
      context switched, at which point the value of $0 in the task context
      would be restored to the correct zero by a store in SAVE_SOME. Fix this
      by not writing to rd if it is $0.
      
      Fixes: 102cedc3 ("MIPS: microMIPS: Floating point support.")
      Signed-off-by: default avatarPaul Burton <paul.burton@imgtec.com>
      Cc: Maciej W. Rozycki <macro@imgtec.com>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Cc: linux-kernel@vger.kernel.org
      Patchwork: https://patchwork.linux-mips.org/patch/13160/Signed-off-by: default avatarRalf Baechle <ralf@linux-mips.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      4051f5a8
    • James Hogan's avatar
      MIPS: Fix uapi include in exported asm/siginfo.h · 8f97af21
      James Hogan authored
      commit 987e5b83 upstream.
      
      Since commit 8cb48fe1 ("MIPS: Provide correct siginfo_t.si_stime"),
      MIPS' uapi/asm/siginfo.h has included uapi/asm-generic/siginfo.h
      directly before defining MIPS' struct siginfo, in order to get the
      necessary definitions needed for the siginfo struct without the generic
      copy_siginfo() hitting compiler errors due to struct siginfo not yet
      being defined.
      
      Now that the generic copy_siginfo() is moved out to linux/signal.h we
      can safely include asm-generic/siginfo.h before defining the MIPS
      specific struct siginfo, which avoids the uapi/ include as well as
      breakage due to generic copy_siginfo() being defined before struct
      siginfo.
      Reported-by: default avatarChristopher Ferris <cferris@google.com>
      Fixes: 8cb48fe1 ("MIPS: Provide correct siginfo_t.si_stime")
      Signed-off-by: default avatarJames Hogan <james.hogan@imgtec.com>
      Cc: Petr Malat <oss@malat.biz>
      Cc: linux-mips@linux-mips.org
      Signed-off-by: default avatarRalf Baechle <ralf@linux-mips.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      8f97af21
    • James Hogan's avatar
      SIGNAL: Move generic copy_siginfo() to signal.h · 009bf182
      James Hogan authored
      commit ca9eb49a upstream.
      
      The generic copy_siginfo() is currently defined in
      asm-generic/siginfo.h, after including uapi/asm-generic/siginfo.h which
      defines the generic struct siginfo. However this makes it awkward for an
      architecture to use it if it has to define its own struct siginfo (e.g.
      MIPS and potentially IA64), since it means that asm-generic/siginfo.h
      can only be included after defining the arch-specific siginfo, which may
      be problematic if the arch-specific definition needs definitions from
      uapi/asm-generic/siginfo.h.
      
      It is possible to work around this by first including
      uapi/asm-generic/siginfo.h to get the constants before defining the
      arch-specific siginfo, and include asm-generic/siginfo.h after. However
      uapi headers can't be included by other uapi headers, so that first
      include has to be in an ifdef __kernel__, with the non __kernel__ case
      including the non-UAPI header instead.
      
      Instead of that mess, move the generic copy_siginfo() definition into
      linux/signal.h, which allows an arch-specific uapi/asm/siginfo.h to
      include asm-generic/siginfo.h and define the arch-specific siginfo, and
      for the generic copy_siginfo() to see that arch-specific definition.
      Signed-off-by: default avatarJames Hogan <james.hogan@imgtec.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Petr Malat <oss@malat.biz>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Christopher Ferris <cferris@google.com>
      Cc: linux-arch@vger.kernel.org
      Cc: linux-mips@linux-mips.org
      Cc: linux-ia64@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Patchwork: https://patchwork.linux-mips.org/patch/12478/Signed-off-by: default avatarRalf Baechle <ralf@linux-mips.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      009bf182
    • Paul Burton's avatar
      MIPS: Sync icache & dcache in set_pte_at · c1bab318
      Paul Burton authored
      commit 37d22a0d upstream.
      
      It's possible for pages to become visible prior to update_mmu_cache
      running if a thread within the same address space preempts the current
      thread or runs simultaneously on another CPU. That is, the following
      scenario is possible:
      
          CPU0                            CPU1
      
          write to page
          flush_dcache_page
          flush_icache_page
          set_pte_at
                                          map page
          update_mmu_cache
      
      If CPU1 maps the page in between CPU0's set_pte_at, which marks it valid
      & visible, and update_mmu_cache where the dcache flush occurs then CPU1s
      icache will fill from stale data (unless it fills from the dcache, in
      which case all is good, but most MIPS CPUs don't have this property).
      Commit 4d46a67a ("MIPS: Fix race condition in lazy cache flushing.")
      attempted to fix that by performing the dcache flush in
      flush_icache_page such that it occurs before the set_pte_at call makes
      the page visible. However it has the problem that not all code that
      writes to pages exposed to userland call flush_icache_page. There are
      many callers of set_pte_at under mm/ and only 2 of them do call
      flush_icache_page. Thus the race window between a page becoming visible
      & being coherent between the icache & dcache remains open in some cases.
      
      To illustrate some of the cases, a WARN was added to __update_cache with
      this patch applied that triggered in cases where a page about to be
      flushed from the dcache was not the last page provided to
      flush_icache_page. That is, backtraces were obtained for cases in which
      the race window is left open without this patch. The 2 standout examples
      follow.
      
      When forking a process:
      
      [   15.271842] [<80417630>] __update_cache+0xcc/0x188
      [   15.277274] [<80530394>] copy_page_range+0x56c/0x6ac
      [   15.282861] [<8042936c>] copy_process.part.54+0xd40/0x17ac
      [   15.289028] [<80429f80>] do_fork+0xe4/0x420
      [   15.293747] [<80413808>] handle_sys+0x128/0x14c
      
      When exec'ing an ELF binary:
      
      [   14.445964] [<80417630>] __update_cache+0xcc/0x188
      [   14.451369] [<80538d88>] move_page_tables+0x414/0x498
      [   14.457075] [<8055d848>] setup_arg_pages+0x220/0x318
      [   14.462685] [<805b0f38>] load_elf_binary+0x530/0x12a0
      [   14.468374] [<8055ec3c>] search_binary_handler+0xbc/0x214
      [   14.474444] [<8055f6c0>] do_execveat_common+0x43c/0x67c
      [   14.480324] [<8055f938>] do_execve+0x38/0x44
      [   14.485137] [<80413808>] handle_sys+0x128/0x14c
      
      These code paths write into a page, call flush_dcache_page then call
      set_pte_at without flush_icache_page inbetween. The end result is that
      the icache can become corrupted & userland processes may execute
      unexpected or invalid code, typically resulting in a reserved
      instruction exception, a trap or a segfault.
      
      Fix this race condition fully by performing any cache maintenance
      required to keep the icache & dcache in sync in set_pte_at, before the
      page is made valid. This has the added bonus of ensuring the cache
      maintenance always happens in one location, rather than being duplicated
      in flush_icache_page & update_mmu_cache. It also matches the way other
      architectures solve the same problem (see arm, ia64 & powerpc).
      Signed-off-by: default avatarPaul Burton <paul.burton@imgtec.com>
      Reported-by: default avatarIonela Voinescu <ionela.voinescu@imgtec.com>
      Cc: Lars Persson <lars.persson@axis.com>
      Fixes: 4d46a67a ("MIPS: Fix race condition in lazy cache flushing.")
      Cc: Steven J. Hill <sjhill@realitydiluted.com>
      Cc: David Daney <david.daney@cavium.com>
      Cc: Huacai Chen <chenhc@lemote.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Jerome Marchand <jmarchan@redhat.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: linux-mips@linux-mips.org
      Cc: linux-kernel@vger.kernel.org
      Patchwork: https://patchwork.linux-mips.org/patch/12722/Signed-off-by: default avatarRalf Baechle <ralf@linux-mips.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      c1bab318
    • Paul Burton's avatar
      MIPS: Handle highmem pages in __update_cache · 3ef36d5c
      Paul Burton authored
      commit f4281bba upstream.
      
      The following patch will expose __update_cache to highmem pages. Handle
      them by mapping them in for the duration of the cache maintenance, just
      like in __flush_dcache_page. The code for that isn't shared because we
      need the page address in __update_cache so sharing became messy. Given
      that the entirity is an extra 5 lines, just duplicate it.
      Signed-off-by: default avatarPaul Burton <paul.burton@imgtec.com>
      Cc: Lars Persson <lars.persson@axis.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Jerome Marchand <jmarchan@redhat.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: linux-mips@linux-mips.org
      Cc: linux-kernel@vger.kernel.org
      Patchwork: https://patchwork.linux-mips.org/patch/12721/Signed-off-by: default avatarRalf Baechle <ralf@linux-mips.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      3ef36d5c
    • Guilherme G. Piccoli's avatar
      Revert "powerpc/eeh: Fix crash in eeh_add_device_early() on Cell" · 6a78d8e9
      Guilherme G. Piccoli authored
      commit c2078d9e upstream.
      
      This reverts commit 89a51df5.
      
      The function eeh_add_device_early() is used to perform EEH
      initialization in devices added later on the system, like in
      hotplug/DLPAR scenarios. Since the commit 89a51df5 ("powerpc/eeh:
      Fix crash in eeh_add_device_early() on Cell") a new check was introduced
      in this function - Cell has no EEH capabilities which led to kernel oops
      if hotplug was performed, so checking for eeh_enabled() was introduced
      to avoid the issue.
      
      However, in architectures that EEH is present like pSeries or PowerNV,
      we might reach a case in which no PCI devices are present on boot time
      and so EEH is not initialized. Then, if a device is added via DLPAR for
      example, eeh_add_device_early() fails because eeh_enabled() is false,
      and EEH end up not being enabled at all.
      
      This reverts the aforementioned patch since a new verification was
      introduced by the commit d91dafc0 ("powerpc/eeh: Delay probing EEH
      device during hotplug") and so the original Cell issue does not happen
      anymore.
      Reviewed-by: default avatarGavin Shan <gwshan@linux.vnet.ibm.com>
      Signed-off-by: default avatarGuilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      6a78d8e9
    • Gavin Shan's avatar
      powerpc/eeh: Restore initial state in eeh_pe_reset_and_recover() · 3f992124
      Gavin Shan authored
      commit 5a0cdbfd upstream.
      
      The function eeh_pe_reset_and_recover() is used to recover EEH
      error when the passthrou device are transferred to guest and
      backwards. The content in the device's config space will be lost
      on PE reset issued in the middle of the recovery. The function
      saves/restores it before/after the reset. However, config access
      to some adapters like Broadcom BCM5719 at this point will causes
      fenced PHB. The config space is always blocked and we save 0xFF's
      that are restored at late point. The memory BARs are totally
      corrupted, causing another EEH error upon access to one of the
      memory BARs.
      
      This restores the config space on those adapters like BCM5719
      from the content saved to the EEH device when it's populated,
      to resolve above issue.
      
      Fixes: 5cfb20b9 ("powerpc/eeh: Emulate EEH recovery for VFIO devices")
      Signed-off-by: default avatarGavin Shan <gwshan@linux.vnet.ibm.com>
      Reviewed-by: default avatarRussell Currey <ruscur@russell.cc>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      3f992124
    • Gavin Shan's avatar
      powerpc/eeh: Don't report error in eeh_pe_reset_and_recover() · 4bf7e15e
      Gavin Shan authored
      commit affeb0f2 upstream.
      
      The function eeh_pe_reset_and_recover() is used to recover EEH
      error when the passthrough device are transferred to guest and
      backwards, meaning the device's driver is vfio-pci or none.
      When the driver is vfio-pci that provides error_detected() error
      handler only, the handler simply stops the guest and it's not
      expected behaviour. On the other hand, no error handlers will
      be called if we don't have a bound driver.
      
      This ignores the error handler in eeh_pe_reset_and_recover()
      that reports the error to device driver to avoid the exceptional
      behaviour.
      
      Fixes: 5cfb20b9 ("powerpc/eeh: Emulate EEH recovery for VFIO devices")
      Signed-off-by: default avatarGavin Shan <gwshan@linux.vnet.ibm.com>
      Reviewed-by: default avatarRussell Currey <ruscur@russell.cc>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      4bf7e15e
    • Vik Heyndrickx's avatar
      sched/loadavg: Fix loadavg artifacts on fully idle and on fully loaded systems · fada1ef9
      Vik Heyndrickx authored
      commit 20878232 upstream.
      
      Systems show a minimal load average of 0.00, 0.01, 0.05 even when they
      have no load at all.
      
      Uptime and /proc/loadavg on all systems with kernels released during the
      last five years up until kernel version 4.6-rc5, show a 5- and 15-minute
      minimum loadavg of 0.01 and 0.05 respectively. This should be 0.00 on
      idle systems, but the way the kernel calculates this value prevents it
      from getting lower than the mentioned values.
      
      Likewise but not as obviously noticeable, a fully loaded system with no
      processes waiting, shows a maximum 1/5/15 loadavg of 1.00, 0.99, 0.95
      (multiplied by number of cores).
      
      Once the (old) load becomes 93 or higher, it mathematically can never
      get lower than 93, even when the active (load) remains 0 forever.
      This results in the strange 0.00, 0.01, 0.05 uptime values on idle
      systems.  Note: 93/2048 = 0.0454..., which rounds up to 0.05.
      
      It is not correct to add a 0.5 rounding (=1024/2048) here, since the
      result from this function is fed back into the next iteration again,
      so the result of that +0.5 rounding value then gets multiplied by
      (2048-2037), and then rounded again, so there is a virtual "ghost"
      load created, next to the old and active load terms.
      
      By changing the way the internally kept value is rounded, that internal
      value equivalent now can reach 0.00 on idle, and 1.00 on full load. Upon
      increasing load, the internally kept load value is rounded up, when the
      load is decreasing, the load value is rounded down.
      
      The modified code was tested on nohz=off and nohz kernels. It was tested
      on vanilla kernel 4.6-rc5 and on centos 7.1 kernel 3.10.0-327. It was
      tested on single, dual, and octal cores system. It was tested on virtual
      hosts and bare hardware. No unwanted effects have been observed, and the
      problems that the patch intended to fix were indeed gone.
      Tested-by: default avatarDamien Wyart <damien.wyart@free.fr>
      Signed-off-by: default avatarVik Heyndrickx <vik.heyndrickx@veribox.net>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Doug Smythies <dsmythies@telus.net>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: 0f004f5a ("sched: Cure more NO_HZ load average woes")
      Link: http://lkml.kernel.org/r/e8d32bff-d544-7748-72b5-3c86cc71f09f@veribox.netSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      fada1ef9
    • wang yanqing's avatar
      rtlwifi: pci: use dev_kfree_skb_irq instead of kfree_skb in rtl_pci_reset_trx_ring · 3d3f5d8e
      wang yanqing authored
      commit cf968937 upstream.
      
      We can't use kfree_skb in irq disable context, because spin_lock_irqsave
      make sure we are always in irq disable context, use dev_kfree_skb_irq
      instead of kfree_skb is better than dev_kfree_skb_any.
      
      This patch fix below kernel warning:
      [ 7612.095528] ------------[ cut here ]------------
      [ 7612.095546] WARNING: CPU: 3 PID: 4460 at kernel/softirq.c:150 __local_bh_enable_ip+0x58/0x80()
      [ 7612.095550] Modules linked in: rtl8723be x86_pkg_temp_thermal btcoexist rtl_pci rtlwifi rtl8723_common
      [ 7612.095567] CPU: 3 PID: 4460 Comm: ifconfig Tainted: G        W       4.4.0+ #4
      [ 7612.095570] Hardware name: LENOVO 20DFA04FCD/20DFA04FCD, BIOS J5ET48WW (1.19 ) 08/27/2015
      [ 7612.095574]  00000000 00000000 da37fc70 c12ce7c5 00000000 da37fca0 c104cc59 c19d4454
      [ 7612.095584]  00000003 0000116c c19d4784 00000096 c10508a8 c10508a8 00000200 c1b42400
      [ 7612.095594]  f29be780 da37fcb0 c104ccad 00000009 00000000 da37fcbc c10508a8 f21f08b8
      [ 7612.095604] Call Trace:
      [ 7612.095614]  [<c12ce7c5>] dump_stack+0x41/0x5c
      [ 7612.095620]  [<c104cc59>] warn_slowpath_common+0x89/0xc0
      [ 7612.095628]  [<c10508a8>] ? __local_bh_enable_ip+0x58/0x80
      [ 7612.095634]  [<c10508a8>] ? __local_bh_enable_ip+0x58/0x80
      [ 7612.095640]  [<c104ccad>] warn_slowpath_null+0x1d/0x20
      [ 7612.095646]  [<c10508a8>] __local_bh_enable_ip+0x58/0x80
      [ 7612.095653]  [<c16b7d34>] destroy_conntrack+0x64/0xa0
      [ 7612.095660]  [<c16b300f>] nf_conntrack_destroy+0xf/0x20
      [ 7612.095665]  [<c1677565>] skb_release_head_state+0x55/0xa0
      [ 7612.095670]  [<c16775bb>] skb_release_all+0xb/0x20
      [ 7612.095674]  [<c167760b>] __kfree_skb+0xb/0x60
      [ 7612.095679]  [<c16776f0>] kfree_skb+0x30/0x70
      [ 7612.095686]  [<f81b869d>] ? rtl_pci_reset_trx_ring+0x22d/0x370 [rtl_pci]
      [ 7612.095692]  [<f81b869d>] rtl_pci_reset_trx_ring+0x22d/0x370 [rtl_pci]
      [ 7612.095698]  [<f81b87f9>] rtl_pci_start+0x19/0x190 [rtl_pci]
      [ 7612.095705]  [<f81970e6>] rtl_op_start+0x56/0x90 [rtlwifi]
      [ 7612.095712]  [<c17e3f16>] drv_start+0x36/0xc0
      [ 7612.095717]  [<c17f5ab3>] ieee80211_do_open+0x2d3/0x890
      [ 7612.095725]  [<c16820fe>] ? call_netdevice_notifiers_info+0x2e/0x60
      [ 7612.095730]  [<c17f60bd>] ieee80211_open+0x4d/0x50
      [ 7612.095736]  [<c16891b3>] __dev_open+0xa3/0x130
      [ 7612.095742]  [<c183fa53>] ? _raw_spin_unlock_bh+0x13/0x20
      [ 7612.095748]  [<c1689499>] __dev_change_flags+0x89/0x140
      [ 7612.095753]  [<c127c70d>] ? selinux_capable+0xd/0x10
      [ 7612.095759]  [<c1689589>] dev_change_flags+0x29/0x60
      [ 7612.095765]  [<c1700b93>] devinet_ioctl+0x553/0x670
      [ 7612.095772]  [<c12db758>] ? _copy_to_user+0x28/0x40
      [ 7612.095777]  [<c17018b5>] inet_ioctl+0x85/0xb0
      [ 7612.095783]  [<c166e647>] sock_ioctl+0x67/0x260
      [ 7612.095788]  [<c166e5e0>] ? sock_fasync+0x80/0x80
      [ 7612.095795]  [<c115c99b>] do_vfs_ioctl+0x6b/0x550
      [ 7612.095800]  [<c127c812>] ? selinux_file_ioctl+0x102/0x1e0
      [ 7612.095807]  [<c10a8914>] ? timekeeping_suspend+0x294/0x320
      [ 7612.095813]  [<c10a256a>] ? __hrtimer_run_queues+0x14a/0x210
      [ 7612.095820]  [<c1276e24>] ? security_file_ioctl+0x34/0x50
      [ 7612.095827]  [<c115cef0>] SyS_ioctl+0x70/0x80
      [ 7612.095832]  [<c1001804>] do_fast_syscall_32+0x84/0x120
      [ 7612.095839]  [<c183ff91>] sysenter_past_esp+0x36/0x55
      [ 7612.095844] ---[ end trace 97e9c637a20e8348 ]---
      Signed-off-by: default avatarWang YanQing <udknight@gmail.com>
      Acked-by: default avatarLarry Finger <Larry.Finger@lwfinger.net>
      Signed-off-by: default avatarKalle Valo <kvalo@codeaurora.org>
      [ kamal: backport to 4.2-stable: files moved ]
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      3d3f5d8e
    • wang yanqing's avatar
      rtlwifi: Fix logic error in enter/exit power-save mode · 5d71c372
      wang yanqing authored
      commit 873ffe15 upstream.
      
      In commit a269913c ("rtlwifi: Rework rtl_lps_leave() and
      rtl_lps_enter() to use work queue"), the tests for enter/exit
      power-save mode were inverted. With this change applied, the
      wifi connection becomes much more stable.
      
      Fixes: a269913c ("rtlwifi: Rework rtl_lps_leave() and rtl_lps_enter() to use work queue")
      Signed-off-by: default avatarWang YanQing <udknight@gmail.com>
      Acked-by: default avatarLarry Finger <Larry.Finger@lwfinger.net>
      Signed-off-by: default avatarKalle Valo <kvalo@codeaurora.org>
      [ kamal: backport to 4.2-stable: files moved ]
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      5d71c372
    • Arnd Bergmann's avatar
      kbuild: move -Wunused-const-variable to W=1 warning level · 7490a2cf
      Arnd Bergmann authored
      commit c9c6837d upstream.
      
      gcc-6 started warning by default about variables that are not
      used anywhere and that are marked 'const', generating many
      false positives in an allmodconfig build, e.g.:
      
      arch/arm/mach-davinci/board-da830-evm.c:282:20: warning: 'da830_evm_emif25_pins' defined but not used [-Wunused-const-variable=]
      arch/arm/plat-omap/dmtimer.c:958:34: warning: 'omap_timer_match' defined but not used [-Wunused-const-variable=]
      drivers/bluetooth/hci_bcm.c:625:39: warning: 'acpi_bcm_default_gpios' defined but not used [-Wunused-const-variable=]
      drivers/char/hw_random/omap-rng.c:92:18: warning: 'reg_map_omap4' defined but not used [-Wunused-const-variable=]
      drivers/devfreq/exynos/exynos5_bus.c:381:32: warning: 'exynos5_busfreq_int_pm' defined but not used [-Wunused-const-variable=]
      drivers/dma/mv_xor.c:1139:34: warning: 'mv_xor_dt_ids' defined but not used [-Wunused-const-variable=]
      
      This is similar to the existing -Wunused-but-set-variable warning
      that was added in an earlier release and that we disable by default
      now and only enable when W=1 is set, so it makes sense to do
      the same here. Once we have eliminated the majority of the
      warnings for both, we can put them back into the default list.
      
      We probably want this in backport kernels as well, to allow building
      them with gcc-6 without introducing extra warnings.
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Acked-by: default avatarOlof Johansson <olof@lixom.net>
      Acked-by: default avatarLee Jones <lee.jones@linaro.org>
      Signed-off-by: default avatarMichal Marek <mmarek@suse.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      7490a2cf
    • Julien Grall's avatar
      arm64: cpuinfo: Missing NULL terminator in compat_hwcap_str · e1b36a1a
      Julien Grall authored
      commit f228b494 upstream.
      
      The loop that browses the array compat_hwcap_str will stop when a NULL
      is encountered, however NULL is missing at the end of array. This will
      lead to overrun until a NULL is found somewhere in the following memory.
      In reality, this works out because the compat_hwcap2_str array tends to
      follow immediately in memory, and that *is* terminated correctly.
      Furthermore, the unsigned int compat_elf_hwcap is checked before
      printing each capability, so we end up doing the right thing because
      the size of the two arrays is less than 32. Still, this is an obvious
      mistake and should be fixed.
      
      Note for backporting: commit 12d11817 ("arm64: Move
      /proc/cpuinfo handling code") moved this code in v4.4. Prior to that
      commit, the same change should be made in arch/arm64/kernel/setup.c.
      
      Fixes: 44b82b77 "arm64: Fix up /proc/cpuinfo"
      Signed-off-by: default avatarJulien Grall <julien.grall@arm.com>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      [ kamal: backport to 4.2-stable: applied to setup.c ]
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      e1b36a1a
    • Marc Zyngier's avatar
      irqchip/gic-v3: Configure all interrupts as non-secure Group-1 · 213dd3dc
      Marc Zyngier authored
      commit 7c9b9730 upstream.
      
      The GICv3 driver wrongly assumes that it runs on the non-secure
      side of a secure-enabled system, while it could be on a system
      with a single security state, or a GICv3 with GICD_CTLR.DS set.
      
      Either way, it is important to configure this properly, or
      interrupts will simply not be delivered on this HW.
      Reported-by: default avatarPeter Maydell <peter.maydell@linaro.org>
      Tested-by: default avatarPeter Maydell <peter.maydell@linaro.org>
      Signed-off-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      213dd3dc
    • Will Deacon's avatar
      irqchip/gic: Ensure ordering between read of INTACK and shared data · 3ab2ef33
      Will Deacon authored
      commit f86c4fbd upstream.
      
      When an IPI is generated by a CPU, the pattern looks roughly like:
      
        <write shared data>
        smp_wmb();
        <write to GIC to signal SGI>
      
      On the receiving CPU we rely on the fact that, once we've taken the
      interrupt, then the freshly written shared data must be visible to us.
      Put another way, the CPU isn't going to speculate taking an interrupt.
      
      Unfortunately, this assumption turns out to be broken.
      
      Consider that CPUx wants to send an IPI to CPUy, which will cause CPUy
      to read some shared_data. Before CPUx has done anything, a random
      peripheral raises an IRQ to the GIC and the IRQ line on CPUy is raised.
      CPUy then takes the IRQ and starts executing the entry code, heading
      towards gic_handle_irq. Furthermore, let's assume that a bunch of the
      previous interrupts handled by CPUy were SGIs, so the branch predictor
      kicks in and speculates that irqnr will be <16 and we're likely to
      head into handle_IPI. The prefetcher then grabs a speculative copy of
      shared_data which contains a stale value.
      
      Meanwhile, CPUx gets round to updating shared_data and asking the GIC
      to send an SGI to CPUy. Internally, the GIC decides that the SGI is
      more important than the peripheral interrupt (which hasn't yet been
      ACKed) but doesn't need to do anything to CPUy, because the IRQ line
      is already raised.
      
      CPUy then reads the ACK register on the GIC, sees the SGI value which
      confirms the branch prediction and we end up with a stale shared_data
      value.
      
      This patch fixes the problem by adding an smp_rmb() to the IPI entry
      code in gic_handle_irq. As it turns out, the combination of a control
      dependency and an ISB instruction from the EOI in the GICv3 driver is
      enough to provide the ordering we need, so we add a comment there
      justifying the absence of an explicit smp_rmb().
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      3ab2ef33
    • Arnd Bergmann's avatar
      gcov: disable tree-loop-im to reduce stack usage · ec63e311
      Arnd Bergmann authored
      commit c87bf431 upstream.
      
      Enabling CONFIG_GCOV_PROFILE_ALL produces us a lot of warnings like
      
      lib/lz4/lz4hc_compress.c: In function 'lz4_compresshcctx':
      lib/lz4/lz4hc_compress.c:514:1: warning: the frame size of 1504 bytes is larger than 1024 bytes [-Wframe-larger-than=]
      
      After some investigation, I found that this behavior started with gcc-4.9,
      and opened https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69702.
      A suggested workaround for it is to use the -fno-tree-loop-im
      flag that turns off one of the optimization stages in gcc, so the
      code runs a little slower but does not use excessive amounts
      of stack.
      
      We could make this conditional on the gcc version, but I could not
      find an easy way to do this in Kbuild and the benefit would be
      fairly small, given that most of the gcc version in production are
      affected now.
      
      I'm marking this for 'stable' backports because it addresses a bug
      with code generation in gcc that exists in all kernel versions
      with the affected gcc releases.
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Acked-by: default avatarPeter Oberparleiter <oberpar@linux.vnet.ibm.com>
      Signed-off-by: default avatarMichal Marek <mmarek@suse.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      ec63e311
    • James Hogan's avatar
      MIPS: KVM: Fix timer IRQ race when writing CP0_Compare · b1568de5
      James Hogan authored
      commit b45bacd2 upstream.
      
      Writing CP0_Compare clears the timer interrupt pending bit
      (CP0_Cause.TI), but this wasn't being done atomically. If a timer
      interrupt raced with the write of the guest CP0_Compare, the timer
      interrupt could end up being pending even though the new CP0_Compare is
      nowhere near CP0_Count.
      
      We were already updating the hrtimer expiry with
      kvm_mips_update_hrtimer(), which used both kvm_mips_freeze_hrtimer() and
      kvm_mips_resume_hrtimer(). Close the race window by expanding out
      kvm_mips_update_hrtimer(), and clearing CP0_Cause.TI and setting
      CP0_Compare between the freeze and resume. Since the pending timer
      interrupt should not be cleared when CP0_Compare is written via the KVM
      user API, an ack argument is added to distinguish the source of the
      write.
      
      Fixes: e30492bb ("MIPS: KVM: Rewrite count/compare timer emulation")
      Signed-off-by: default avatarJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim KrčmáÅ" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      b1568de5
    • James Hogan's avatar
      MIPS: KVM: Fix timer IRQ race when freezing timer · 780e20ab
      James Hogan authored
      commit 4355c44f upstream.
      
      There's a particularly narrow and subtle race condition when the
      software emulated guest timer is frozen which can allow a guest timer
      interrupt to be missed.
      
      This happens due to the hrtimer expiry being inexact, so very
      occasionally the freeze time will be after the moment when the emulated
      CP0_Count transitions to the same value as CP0_Compare (so an IRQ should
      be generated), but before the moment when the hrtimer is due to expire
      (so no IRQ is generated). The IRQ won't be generated when the timer is
      resumed either, since the resume CP0_Count will already match CP0_Compare.
      
      With VZ guests in particular this is far more likely to happen, since
      the soft timer may be frozen frequently in order to restore the timer
      state to the hardware guest timer. This happens after 5-10 hours of
      guest soak testing, resulting in an overflow in guest kernel timekeeping
      calculations, hanging the guest. A more focussed test case to
      intentionally hit the race (with the help of a new hypcall to cause the
      timer state to migrated between hardware & software) hits the condition
      fairly reliably within around 30 seconds.
      
      Instead of relying purely on the inexact hrtimer expiry to determine
      whether an IRQ should be generated, read the guest CP0_Compare and
      directly check whether the freeze time is before or after it. Only if
      CP0_Count is on or after CP0_Compare do we check the hrtimer expiry to
      determine whether the last IRQ has already been generated (which will
      have pushed back the expiry by one timer period).
      
      Fixes: e30492bb ("MIPS: KVM: Rewrite count/compare timer emulation")
      Signed-off-by: default avatarJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim KrčmáÅ" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      780e20ab
    • Catalin Vasile's avatar
      crypto: caam - fix caam_jr_alloc() ret code · 329e7616
      Catalin Vasile authored
      commit e930c765 upstream.
      
      caam_jr_alloc() used to return NULL if a JR device could not be
      allocated for a session. In turn, every user of this function used
      IS_ERR() function to verify if anything went wrong, which does NOT look
      for NULL values. This made the kernel crash if the sanity check failed,
      because the driver continued to think it had allocated a valid JR dev
      instance to the session and at some point it tries to do a caam_jr_free()
      on a NULL JR dev pointer.
      This patch is a fix for this issue.
      Signed-off-by: default avatarCatalin Vasile <cata.vasile@nxp.com>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      329e7616