1. 05 Oct, 2017 12 commits
    • md/raid5: fix a race condition in stripe batch · 648798cc
      Shaohua Li authored
      commit 3664847d upstream.
      
      We have a race condition in the scenario below. Say we have 3 contiguous
      stripes, sh1, sh2 and sh3, where sh1 is the batch head of sh2 and sh3:
      
      CPU1				CPU2				CPU3
      handle_stripe(sh3)
      				stripe_add_to_batch_list(sh3)
      				-> lock(sh2, sh3)
      				-> lock batch_lock(sh1)
      				-> add sh3 to batch_list of sh1
      				-> unlock batch_lock(sh1)
      								clear_batch_ready(sh1)
      								-> lock(sh1) and batch_lock(sh1)
      								-> clear STRIPE_BATCH_READY for all stripes in batch_list
      								-> unlock(sh1) and batch_lock(sh1)
      ->clear_batch_ready(sh3)
      -->test_and_clear_bit(STRIPE_BATCH_READY, sh3)
      --->return 0 as sh->batch_head == NULL
      				-> sh3->batch_head = sh1
      				-> unlock (sh2, sh3)
      
      On CPU1, handle_stripe will continue to handle sh3 even though it is in the
      batch stripe list of sh1. By moving the sh3->batch_head assignment inside the
      batch_lock section, we make it impossible to clear STRIPE_BATCH_READY before
      batch_head is set.
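
      The ordering described above can be illustrated with a small sequential
      userspace model. This is only a sketch under stated assumptions: struct
      stripe, add_to_batch(), head_clear_ready() and member_must_skip() are
      hypothetical names, a pthread mutex stands in for the kernel's batch_lock,
      and list handling is elided. It is not the raid5 code itself.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, much-reduced stand-in for the kernel's struct stripe_head. */
struct stripe {
    pthread_mutex_t batch_lock;  /* guards batch membership of this head */
    struct stripe *batch_head;   /* NULL while the stripe is not batched */
    bool batch_ready;            /* models the STRIPE_BATCH_READY bit */
};

/* Add sh to head's batch. batch_head is assigned before batch_lock is
 * dropped, so membership and batch_head become visible together. */
static void add_to_batch(struct stripe *head, struct stripe *sh)
{
    assert(head != NULL && sh != NULL);
    pthread_mutex_lock(&head->batch_lock);
    /* list insertion elided in this model */
    sh->batch_head = head;
    pthread_mutex_unlock(&head->batch_lock);
}

/* Head side: clear the ready flag of a batch member under batch_lock. */
static void head_clear_ready(struct stripe *head, struct stripe *member)
{
    pthread_mutex_lock(&head->batch_lock);
    member->batch_ready = false;
    pthread_mutex_unlock(&head->batch_lock);
}

/* Member side, loosely modeled on clear_batch_ready(): clear the flag and
 * report whether another stripe owns sh as its batch head. */
static bool member_must_skip(struct stripe *sh)
{
    sh->batch_ready = false;
    return sh->batch_head != NULL && sh->batch_head != sh;
}
```

      In this model, once head_clear_ready() can reach a member, that member's
      batch_head is already set, so member_must_skip() reports it as batched.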
      
      Thanks Stephane for helping debug this tricky issue.
      Reported-and-tested-by: Stephane Thiell <sthiell@stanford.edu>
      Signed-off-by: Shaohua Li <shli@fb.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • tracing: Erase irqsoff trace with empty write · 5fb4be27
      Bo Yan authored
      commit 8dd33bcb upstream.
      
      One convenient way to erase the trace is "echo > trace". However, this
      is currently broken if the current tracer is the irqsoff tracer, because
      the irqsoff tracer uses max_buffer as its default trace buffer.
      
      Set the max_buffer as the one to be cleared when it's the trace
      buffer currently in use.
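
      A minimal sketch of the buffer selection, assuming a reduced model: the
      struct layouts, the tracer_uses_max flag, and the function names are
      hypothetical and only illustrate "clear whichever buffer is in use".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced models of the tracing structures. */
struct trace_buffer {
    size_t len;                  /* amount of recorded data */
};

struct trace_array {
    struct trace_buffer trace_buffer;
    struct trace_buffer max_buffer;
    bool tracer_uses_max;        /* true for tracers such as irqsoff */
};

/* Pick the buffer that the current tracer actually reads from. */
static struct trace_buffer *buffer_to_reset(struct trace_array *tr)
{
    assert(tr != NULL);
    return tr->tracer_uses_max ? &tr->max_buffer : &tr->trace_buffer;
}

/* Models the effect of an empty write to the trace file. */
static void erase_trace(struct trace_array *tr)
{
    buffer_to_reset(tr)->len = 0;
}
```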
      
      Link: http://lkml.kernel.org/r/1505754215-29411-1-git-send-email-byan@nvidia.com
      
      Cc: <mingo@redhat.com>
      Fixes: 4acd4d00 ("tracing: give easy way to clear trace buffer")
      Signed-off-by: Bo Yan <byan@nvidia.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • tracing: Fix trace_pipe behavior for instance traces · 97d402e6
      Tahsin Erdogan authored
      commit 75df6e68 upstream.
      
      When reading data from trace_pipe, tracing_wait_pipe() performs a
      check to see if tracing has been turned off after some data was read.
      Currently, this check always looks at the global trace state, but it
      should check the trace instance in which trace_pipe is located.
      
      Because of this bug, cat instances/i1/trace_pipe in the following
      script will immediately exit instead of waiting for data:
      
      cd /sys/kernel/debug/tracing
      echo 0 > tracing_on
      mkdir -p instances/i1
      echo 1 > instances/i1/tracing_on
      echo 1 > instances/i1/events/sched/sched_process_exec/enable
      cat instances/i1/trace_pipe
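
      The intended check can be sketched as follows. This is a hypothetical
      model, not the kernel function: struct trace_array is reduced to a single
      flag and pipe_reader_should_stop() is an illustrative name. The point is
      that the decision uses the state of the instance being read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced model of one trace instance. */
struct trace_array {
    bool tracing_on;
};

/* A reader stops waiting only if tracing is off in its own instance
 * and it has already consumed some data (pos > 0). */
static bool pipe_reader_should_stop(const struct trace_array *tr, size_t pos)
{
    assert(tr != NULL);
    return !tr->tracing_on && pos > 0;
}
```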
      
      Link: http://lkml.kernel.org/r/20170917102348.1615-1-tahsin@google.com
      
      Fixes: 10246fa3 ("tracing: give easy way to clear trace buffer")
      Signed-off-by: Tahsin Erdogan <tahsin@google.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • KVM: PPC: Book3S HV: Protect updates to spapr_tce_tables list · 8dcf70ab
      Paul Mackerras authored
      commit edd03602 upstream.
      
      Al Viro pointed out that while one thread of a process is executing
      in kvm_vm_ioctl_create_spapr_tce(), another thread could guess the
      file descriptor returned by anon_inode_getfd() and close() it before
      the first thread has added it to the kvm->arch.spapr_tce_tables list.
      That highlights a more general problem: there is no mutual exclusion
      between writers to the spapr_tce_tables list, leading to the
      possibility of the list becoming corrupted, which could cause a
      host kernel crash.
      
      To fix the mutual exclusion problem, we add a mutex_lock/unlock
      pair around the list_del_rcu in kvm_spapr_tce_release().
      
      If another thread does guess the file descriptor returned by the
      anon_inode_getfd() call in kvm_vm_ioctl_create_spapr_tce() and closes
      it, its call to kvm_spapr_tce_release() will not do any harm because
      it will have to wait until the first thread has released kvm->lock.
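
      As a rough userspace illustration of "all list writers take the same
      mutex", consider the sketch below. It assumes a plain singly linked list
      and a pthread mutex in place of the RCU list and kvm->lock; every name in
      it is hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical, reduced models of the table entry and the VM. */
struct tce_table {
    struct tce_table *next;
    unsigned long liobn;
};

struct vm {
    pthread_mutex_t lock;        /* stands in for kvm->lock */
    struct tce_table *tables;    /* stands in for spapr_tce_tables */
};

/* Writer 1: insertion happens with the lock held. */
static void table_add(struct vm *vm, struct tce_table *t)
{
    assert(vm != NULL && t != NULL);
    pthread_mutex_lock(&vm->lock);
    t->next = vm->tables;
    vm->tables = t;
    pthread_mutex_unlock(&vm->lock);
}

/* Writer 2: removal takes the same lock, so it cannot interleave
 * with an insertion that is still in progress. */
static void table_release(struct vm *vm, struct tce_table *t)
{
    struct tce_table **pp;

    pthread_mutex_lock(&vm->lock);
    for (pp = &vm->tables; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == t) {
            *pp = t->next;
            break;
        }
    }
    pthread_mutex_unlock(&vm->lock);
}
```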
      
      The other things that the second thread could do with the guessed
      file descriptor are to mmap it or to pass it as a parameter to a
      KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE ioctl on a KVM device fd.  An mmap
      call won't cause any harm because kvm_spapr_tce_mmap() and
      kvm_spapr_tce_fault() don't access the spapr_tce_tables list or
      the kvmppc_spapr_tce_table.list field, and the fields that they do use
      have been properly initialized by the time of the anon_inode_getfd()
      call.
      
      The KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE ioctl calls
      kvm_spapr_tce_attach_iommu_group(), which scans the spapr_tce_tables
      list looking for the kvmppc_spapr_tce_table struct corresponding to
      the fd given as the parameter.  Either it will find the new entry
      or it won't; if it doesn't, it just returns an error, and if it
      does, it will function normally.  So, in each case there is no
      harmful effect.
      
      [paulus@ozlabs.org - moved parts of the upstream patch into the backport
       of 47c5310a, adjusted this commit message accordingly.]
      
      Fixes: 366baf28 ("KVM: PPC: Use RCU for arch.spapr_tce_tables")
      Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • KVM: PPC: Book3S: Fix race and leak in kvm_vm_ioctl_create_spapr_tce() · 18b7919a
      Paul Mackerras authored
      commit 47c5310a upstream, with part
      of commit edd03602 folded in.
      
      Nixiaoming pointed out that there is a memory leak in
      kvm_vm_ioctl_create_spapr_tce() if the call to anon_inode_getfd()
      fails; the memory allocated for the kvmppc_spapr_tce_table struct
      is not freed, and nor are the pages allocated for the iommu
      tables.  In addition, we have already incremented the process's
      count of locked memory pages, and this doesn't get restored on
      error.
      
      David Hildenbrand pointed out that there is a race in that the
      function checks early on that there is not already an entry in the
      stt->iommu_tables list with the same LIOBN, but an entry with the
      same LIOBN could get added between then and when the new entry is
      added to the list.
      
      This fixes all three problems.  To simplify things, we now call
      anon_inode_getfd() before placing the new entry in the list.  The
      check for an existing entry is done while holding the kvm->lock
      mutex, immediately before adding the new entry to the list.
      Finally, on failure we now call kvmppc_account_memlimit to
      decrement the process's count of locked memory pages.
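
      The resulting control flow can be sketched in userspace as below. This is
      an illustration under assumptions, not the KVM code: table_create(), the
      fd_fails parameter (which simulates a failing anon_inode_getfd() call),
      TABLE_PAGES, and the struct layouts are all hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

enum { TABLE_PAGES = 4 };        /* arbitrary size for the model */

struct tce_table {
    struct tce_table *next;
    unsigned long liobn;
};

struct vm {
    pthread_mutex_t lock;        /* stands in for kvm->lock */
    struct tce_table *tables;
    long locked_pages;           /* models the locked-memory accounting */
};

static int table_create(struct vm *vm, unsigned long liobn, bool fd_fails)
{
    struct tce_table *t, *it;
    int ret = 0;

    assert(vm != NULL);
    t = calloc(1, sizeof(*t));
    if (t == NULL)
        return -ENOMEM;
    t->liobn = liobn;
    vm->locked_pages += TABLE_PAGES;

    pthread_mutex_lock(&vm->lock);
    /* Duplicate check and insertion happen under one lock hold. */
    for (it = vm->tables; it != NULL; it = it->next) {
        if (it->liobn == liobn) {
            ret = -EBUSY;
            break;
        }
    }
    if (ret == 0 && fd_fails)
        ret = -EMFILE;           /* simulated fd allocation failure */
    if (ret == 0) {
        t->next = vm->tables;
        vm->tables = t;
    }
    pthread_mutex_unlock(&vm->lock);

    if (ret != 0) {
        /* Unwind everything done so far: accounting and memory. */
        vm->locked_pages -= TABLE_PAGES;
        free(t);
    }
    return ret;
}
```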
      
      [paulus@ozlabs.org - folded in that part of edd03602 ("KVM:
       PPC: Book3S HV: Protect updates to spapr_tce_tables list", 2017-08-28)
       which restructured the code that 47c5310a modified, to avoid
       a build failure caused by the absence of put_unused_fd().]
      
      Fixes: 54738c09 ("KVM: PPC: Accelerate H_PUT_TCE by implementing it in real mode")
      Fixes: f8626985 ("KVM: PPC: Account TCE-containing pages in locked_vm")
      Reported-by: Nixiaoming <nixiaoming@huawei.com>
      Reported-by: David Hildenbrand <david@redhat.com>
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • genirq: Make sparse_irq_lock protect what it should protect · 3d5960c8
      Thomas Gleixner authored
      commit 12ac1d0f upstream.
      
      for_each_active_irq() iterates the sparse irq allocation bitmap. The caller
      must hold sparse_irq_lock. Several code paths expect that an active bit in
      the sparse bitmap also has a valid interrupt descriptor.
      
      Unfortunately that's not true. The (de)allocation is a two step process,
      which holds the sparse_irq_lock only across the queue/remove from the radix
      tree and the set/clear in the allocation bitmap.
      
      If an iteration takes sparse_irq_lock between the two steps, then it might
      see an active bit whose corresponding irq descriptor is NULL. If that is
      dereferenced unconditionally, then the kernel oopses. Of course, all
      iterator sites could be audited and fixed, but....
      
      There is no reason why the sparse_irq_lock needs to be dropped between the
      two steps. In fact, the code becomes simpler when the mutex is held across
      both, and the semantics become more straightforward, so future problems of
      missing NULL pointer checks in the iteration are avoided and all existing
      sites are fixed in one go.
      
      Expand the lock held sections so both operations are covered and the bitmap
      and the radix tree are in sync.
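
      A small userspace model of the expanded lock sections follows. It is a
      sketch under assumptions: a fixed array stands in for the radix tree, a
      pthread mutex for sparse_irq_lock, and all model_* names are
      hypothetical. The iterator dereferences descriptors without a NULL check,
      which is safe here only because bitmap and table change together.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define MODEL_NR_IRQS 8

struct irq_desc {
    unsigned int irq;
};

static pthread_mutex_t sparse_irq_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned long allocated_irqs;                 /* allocation bitmap */
static struct irq_desc *desc_table[MODEL_NR_IRQS];   /* radix tree stand-in */

static int model_alloc_desc(unsigned int irq)
{
    struct irq_desc *desc;

    if (irq >= MODEL_NR_IRQS)
        return -1;
    desc = malloc(sizeof(*desc));
    if (desc == NULL)
        return -1;
    desc->irq = irq;

    pthread_mutex_lock(&sparse_irq_lock);
    if (allocated_irqs & (1UL << irq)) {
        pthread_mutex_unlock(&sparse_irq_lock);
        free(desc);
        return -1;
    }
    /* Both steps under one lock hold: bit set and descriptor stored. */
    allocated_irqs |= 1UL << irq;
    desc_table[irq] = desc;
    pthread_mutex_unlock(&sparse_irq_lock);
    return 0;
}

static void model_free_desc(unsigned int irq)
{
    assert(irq < MODEL_NR_IRQS);
    pthread_mutex_lock(&sparse_irq_lock);
    /* Both steps under one lock hold: descriptor removed and bit cleared. */
    free(desc_table[irq]);
    desc_table[irq] = NULL;
    allocated_irqs &= ~(1UL << irq);
    pthread_mutex_unlock(&sparse_irq_lock);
}

/* Iterator that trusts every active bit to have a valid descriptor. */
static unsigned int model_count_active(void)
{
    unsigned int irq, n = 0;

    pthread_mutex_lock(&sparse_irq_lock);
    for (irq = 0; irq < MODEL_NR_IRQS; irq++) {
        if (allocated_irqs & (1UL << irq))
            n += (desc_table[irq]->irq == irq);
    }
    pthread_mutex_unlock(&sparse_irq_lock);
    return n;
}
```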
      
      Fixes: a05a900a ("genirq: Make sparse_lock a mutex")
      Reported-and-tested-by: Huang Ying <ying.huang@intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mac80211: flush hw_roc_start work before cancelling the ROC · e167b4ad
      Avraham Stern authored
      commit 6e46d8ce upstream.
      
      When HW ROC is supported it is possible that after the HW notified
      that the ROC has started, the ROC was cancelled and another ROC was
      added while the hw_roc_start worker is waiting on the mutex (since
      cancelling the ROC and adding another one also holds the same mutex).
      As a result, the hw_roc_start worker will continue to run after the
      new ROC is added but before it is actually started by the HW.
      This may result in notifying userspace that the ROC has started before
      it actually does, or in case of management tx ROC, in an attempt to
      tx while not on the right channel.
      
      In addition, when the driver notifies mac80211 that the second ROC
      has started, mac80211 will warn that this ROC has already been
      notified.
      
      Fix this by flushing the hw_roc_start work before cancelling an ROC.
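
      The effect of flushing first can be shown with a sequential toy model.
      Everything here is an assumption made for illustration: struct roc_state
      and the three functions are hypothetical, and "flushing" is modeled by
      simply running the pending work synchronously before the cancel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced model of the remain-on-channel state. */
struct roc_state {
    int current_roc;             /* id of the active ROC, 0 = none */
    bool start_work_pending;     /* start work queued but not yet run */
    int notified_roc;            /* id last reported as started */
};

/* The deferred work: report the current ROC as started. */
static void hw_roc_start_work(struct roc_state *s)
{
    assert(s != NULL);
    if (!s->start_work_pending)
        return;
    s->start_work_pending = false;
    s->notified_roc = s->current_roc;
}

/* Cancel: run any pending start work first, then drop the ROC, so the
 * work can never fire later against a ROC that was added afterwards. */
static void cancel_roc(struct roc_state *s)
{
    hw_roc_start_work(s);        /* models flushing the work item */
    s->current_roc = 0;
}

static void add_roc(struct roc_state *s, int id)
{
    s->current_roc = id;
}
```

      In this model, a notification queued for the first ROC is delivered
      against the first ROC even if a second one is added right after the
      cancel.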
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Avraham Stern <avraham.stern@intel.com>
      Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mac80211_hwsim: Use proper TX power · e7e0f0dd
      Beni Lev authored
      commit 9de981f5 upstream.
      
      In struct ieee80211_tx_info, the control.vif pointer and
      rate_driver_data[0] occupy the same storage, depending on how the union
      is used.
      During the whole TX process, the union is treated as a control struct,
      which holds the vif that is later used in the tx flow, especially to
      derive the tx power to use.
      Referring directly to rate_driver_data[0] and assigning a value to it
      overwrites the vif pointer, making all later references to it invalid.
      Moreover, rate_driver_data[0] isn't used later in the flow to retrieve
      the channel that it points to.
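
      The aliasing can be reproduced with a much-reduced model of the union.
      This is a sketch only: struct model_tx_info is hypothetical and keeps
      just the two overlapping members, so its layout is not the real layout of
      struct ieee80211_tx_info.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical two-member model: both views share the same storage. */
struct model_tx_info {
    union {
        struct {
            void *vif;
        } control;
        void *rate_driver_data[2];
    };
};

/* Store a vif, then store a channel through the other union view,
 * and report whether the vif pointer is still intact. */
static bool vif_survives_driver_data_write(void *vif, void *chan)
{
    struct model_tx_info info;

    assert(vif != chan);
    info.control.vif = vif;
    info.rate_driver_data[0] = chan;   /* overwrites control.vif */
    return info.control.vif == vif;
}
```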

      Signed-off-by: Beni Lev <beni.lev@intel.com>
      Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mac80211: fix VLAN handling with TXQs · 59862b04
      Johannes Berg authored
      commit 53168215 upstream.
      
      With TXQs, the AP_VLAN interfaces are resolved to their owner AP
      interface when enqueuing the frame, which makes sense since the
      frame really goes out on that as far as the driver is concerned.
      
      However, this introduces a problem: frames to be encrypted with
      a VLAN-specific GTK will now be encrypted with the AP GTK, since
      the information about which virtual interface to use to select
      the key is taken from the TXQ.
      
      Fix this by preserving info->control.vif and using that in the
      dequeue function. This now requires doing the driver-mapping
      in the dequeue as well.
      
      Since there's no way to filter the frames that are sitting on a
      TXQ, drop all frames, which may affect other interfaces, when an
      AP_VLAN is removed.

      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • fs/proc: Report eip/esp in /proc/PID/stat for coredumping · 9ad15a25
      John Ogness authored
      commit fd7d5627 upstream.
      
      Commit 0a1eb2d4 ("fs/proc: Stop reporting eip and esp in
      /proc/PID/stat") stopped reporting eip/esp because it is
      racy and dangerous for executing tasks. The comment adds:
      
          As far as I know, there are no use programs that make any
          material use of these fields, so just get rid of them.
      
      However, existing userspace core-dump-handler applications (for
      example, minicoredumper) are using these fields since they
      provide an excellent cross-platform interface to these valuable
      pointers. So that commit introduced a user space visible
      regression.
      
      Partially revert the change and make the readout possible for
      tasks with the proper permissions and only if the target task
      has the PF_DUMPCORE flag set.
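
      The gating condition can be sketched as below. This is an illustrative
      model, not the fs/proc code: struct model_task, MODEL_PF_DUMPCORE (an
      arbitrary bit chosen for the model), and stat_ip_sp() are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PF_DUMPCORE 0x1u   /* arbitrary bit for this model */

/* Hypothetical, reduced model of a task. */
struct model_task {
    unsigned int flags;
    unsigned long ip;            /* instruction pointer */
    unsigned long sp;            /* stack pointer */
};

/* Report ip/sp only to permitted readers and only while the target
 * is dumping core; otherwise report zeros. */
static void stat_ip_sp(const struct model_task *t, bool permitted,
                       unsigned long *eip, unsigned long *esp)
{
    assert(t != NULL && eip != NULL && esp != NULL);
    *eip = 0;
    *esp = 0;
    if (permitted && (t->flags & MODEL_PF_DUMPCORE)) {
        *eip = t->ip;
        *esp = t->sp;
    }
}
```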
      
      Fixes: 0a1eb2d4 ("fs/proc: Stop reporting eip and esp in /proc/PID/stat")
      Reported-by: Marco Felsch <marco.felsch@preh.de>
      Signed-off-by: John Ogness <john.ogness@linutronix.de>
      Reviewed-by: Andy Lutomirski <luto@kernel.org>
      Cc: Tycho Andersen <tycho.andersen@canonical.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Linux API <linux-api@vger.kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/87poatfwg6.fsf@linutronix.de
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • cifs: release auth_key.response for reconnect. · b6a77c7b
      Shu Wang authored
      commit f5c4ba81 upstream.
      
      There is a race that causes a cifs reconnect in cifs_mount:
      - cifs_mount
        - cifs_get_tcp_session
          - [ start thread cifs_demultiplex_thread
            - cifs_read_from_socket: -ECONNABORTED
              - DELAY_WORK smb2_reconnect_server ]
        - cifs_setup_session
        - [ smb2_reconnect_server ]
      
      auth_key.response is allocated in cifs_setup_session and is released
      when the session is destroyed. So when the session reconnects,
      auth_key.response should be checked and released.
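
      A userspace sketch of "free the previous response before setting up
      again" follows. It assumes malloc/free in place of the kernel allocator;
      struct session, the model_* functions, and the live_allocs counter (used
      only to observe leaks) are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

static int live_allocs;          /* counts outstanding responses */

/* Hypothetical, reduced model of a session. */
struct session {
    struct {
        unsigned char *response;
        size_t len;
    } auth_key;
};

/* Session setup: release any response left over from a previous
 * setup before allocating a new one. */
static int model_setup_session(struct session *ses, size_t len)
{
    assert(ses != NULL && len > 0);
    if (ses->auth_key.response != NULL) {
        free(ses->auth_key.response);
        ses->auth_key.response = NULL;
        ses->auth_key.len = 0;
        live_allocs--;
    }
    ses->auth_key.response = malloc(len);
    if (ses->auth_key.response == NULL)
        return -1;
    ses->auth_key.len = len;
    live_allocs++;
    return 0;
}

/* Session teardown. */
static void model_free_session(struct session *ses)
{
    if (ses->auth_key.response != NULL) {
        free(ses->auth_key.response);
        ses->auth_key.response = NULL;
        ses->auth_key.len = 0;
        live_allocs--;
    }
}
```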
      
      Tested with my system:
      CIFS VFS: Free previous auth_key.response = ffff8800320bbf80
      
      A simple auth_key.response allocation call trace:
      - cifs_setup_session
      - SMB2_sess_setup
      - SMB2_sess_auth_rawntlmssp_authenticate
      - build_ntlmssp_auth_blob
      - setup_ntlmv2_rsp

      Signed-off-by: Shu Wang <shuwang@redhat.com>
      Signed-off-by: Steve French <smfrench@gmail.com>
      Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • cifs: release cifs root_cred after exit_cifs · 9a7bc3f0
      Shu Wang authored
      commit 94183331 upstream.
      
      A memory leak was found by kmemleak. exit_cifs_spnego should be
      called before the cifs module is removed; otherwise the cifs
      root_cred will not be released.
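
      The init/exit pairing can be modeled in userspace as below. This is a
      sketch under assumptions: the model_* functions, the spnego_cred pointer,
      and the live_creds counter are hypothetical stand-ins, and malloc/free
      replace the kernel credential helpers.

```c
#include <assert.h>
#include <stdlib.h>

static int live_creds;           /* counts outstanding credentials */
static void *spnego_cred;        /* stands in for the root credential */

/* Allocates the credential at module load. */
static int model_init_spnego(void)
{
    assert(spnego_cred == NULL);
    spnego_cred = malloc(1);
    if (spnego_cred == NULL)
        return -1;
    live_creds++;
    return 0;
}

/* Releases the credential; must be reachable from module exit. */
static void model_exit_spnego(void)
{
    if (spnego_cred != NULL) {
        free(spnego_cred);
        spnego_cred = NULL;
        live_creds--;
    }
}

static int model_module_init(void)
{
    return model_init_spnego();
}

static void model_module_exit(void)
{
    /* other teardown elided in this model */
    model_exit_spnego();         /* without this call the cred leaks */
}
```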
      
      kmemleak report:
      unreferenced object 0xffff880070a3ce40 (size 192):
        backtrace:
           kmemleak_alloc+0x4a/0xa0
           kmem_cache_alloc+0xc7/0x1d0
           prepare_kernel_cred+0x20/0x120
           init_cifs_spnego+0x2d/0x170 [cifs]
           0xffffffffc07801f3
           do_one_initcall+0x51/0x1b0
           do_init_module+0x60/0x1fd
           load_module+0x161e/0x1b60
           SYSC_finit_module+0xa9/0x100
           SyS_finit_module+0xe/0x10

      Signed-off-by: Shu Wang <shuwang@redhat.com>
      Signed-off-by: Steve French <smfrench@gmail.com>
      Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  2. 27 Sep, 2017 28 commits