1. 09 Apr, 2015 40 commits
    • Dave Hansen's avatar
      ipc/shm.c: fix overly aggressive shmdt() when calls span multiple segments · a540d0b8
      Dave Hansen authored
      commit d3c97900 upstream.
      
      This is a highly-contrived scenario.  But, a single shmdt() call can be
      induced in to unmapping memory from mulitple shm segments.  Example code
      is here:
      
      	http://www.sr71.net/~dave/intel/shmfun.c
      
      The fix is pretty simple: Record the 'struct file' for the first VMA we
      encounter and then stick to it.  Decline to unmap anything not from the
      same file and thus the same segment.
      
      I found this by inspection and the odds of anyone hitting this in practice
      are pretty darn small.
      
      Lightly tested, but it's a pretty small patch.
      Signed-off-by: default avatarDave Hansen <dave.hansen@linux.intel.com>
      Cc: Manfred Spraul <manfred@colorfullife.com>
      Reviewed-by: default avatarDavidlohr Bueso <dave@stgolabs.net>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      a540d0b8
    • Manfred Spraul's avatar
      ipc/sem.c: fully initialize sem_array before making it visible · dd06a962
      Manfred Spraul authored
      commit e8577d1f upstream.
      
      ipc_addid() makes a new ipc identifier visible to everyone.  New objects
      start as locked, so that the caller can complete the initialization
      after the call.  Within struct sem_array, at least sma->sem_base and
      sma->sem_nsems are accessed without any locks, therefore this approach
      doesn't work.
      
      Thus: Move the ipc_addid() to the end of the initialization.
      Signed-off-by: default avatarManfred Spraul <manfred@colorfullife.com>
      Reported-by: default avatarRik van Riel <riel@redhat.com>
      Acked-by: default avatarRik van Riel <riel@redhat.com>
      Acked-by: default avatarDavidlohr Bueso <dave@stgolabs.net>
      Acked-by: default avatarRafael Aquini <aquini@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      dd06a962
    • David Jeffery's avatar
      Don't leak a key reference if request_key() tries to use a revoked keyring · da95a0de
      David Jeffery authored
      commit d0709f1e upstream.
      
      If a request_key() call to allocate and fill out a key attempts to insert the
      key structure into a revoked keyring, the key will leak, using memory and part
      of the user's key quota until the system reboots. This is from a failure of
      construct_alloc_key() to decrement the key's reference count after the attempt
      to insert into the requested keyring is rejected.
      
      key_put() needs to be called in the link_prealloc_failed callpath to ensure
      the unused key is released.
      Signed-off-by: default avatarDavid Jeffery <djeffery@redhat.com>
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Signed-off-by: default avatarJames Morris <james.l.morris@oracle.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      da95a0de
    • Michael Wang's avatar
      power, sched: stop updating inside arch_update_cpu_topology() when nothing to be update · c4b022d1
      Michael Wang authored
      commit 9a013361 upstream.
      
      Since v1:
      	Edited the comment according to Srivatsa's suggestion.
      
      During the testing, we encounter below WARN followed by Oops:
      
      	WARNING: at kernel/sched/core.c:6218
      	...
      	NIP [c000000000101660] .build_sched_domains+0x11d0/0x1200
      	LR [c000000000101358] .build_sched_domains+0xec8/0x1200
      	PACATMSCRATCH [800000000000f032]
      	Call Trace:
      	[c00000001b103850] [c000000000101358] .build_sched_domains+0xec8/0x1200
      	[c00000001b1039a0] [c00000000010aad4] .partition_sched_domains+0x484/0x510
      	[c00000001b103aa0] [c00000000016d0a8] .rebuild_sched_domains+0x68/0xa0
      	[c00000001b103b30] [c00000000005cbf0] .topology_work_fn+0x10/0x30
      	...
      	Oops: Kernel access of bad area, sig: 11 [#1]
      	...
      	NIP [c00000000045c000] .__bitmap_weight+0x60/0xf0
      	LR [c00000000010132c] .build_sched_domains+0xe9c/0x1200
      	PACATMSCRATCH [8000000000029032]
      	Call Trace:
      	[c00000001b1037a0] [c000000000288ff4] .kmem_cache_alloc_node_trace+0x184/0x3a0
      	[c00000001b103850] [c00000000010132c] .build_sched_domains+0xe9c/0x1200
      	[c00000001b1039a0] [c00000000010aad4] .partition_sched_domains+0x484/0x510
      	[c00000001b103aa0] [c00000000016d0a8] .rebuild_sched_domains+0x68/0xa0
      	[c00000001b103b30] [c00000000005cbf0] .topology_work_fn+0x10/0x30
      	...
      
      This was caused by that 'sd->groups == NULL' after building groups, which
      was caused by the empty 'sd->span'.
      
      The cpu's domain contained nothing because the cpu was assigned to a wrong
      node, due to the following unfortunate sequence of events:
      
      1. The hypervisor sent a topology update to the guest OS, to notify changes
         to the cpu-node mapping. However, the update was actually redundant - i.e.,
         the "new" mapping was exactly the same as the old one.
      
      2. Due to this, the 'updated_cpus' mask turned out to be empty after exiting
         the 'for-loop' in arch_update_cpu_topology().
      
      3. So we ended up calling stop-machine() with an empty cpumask list, which made
         stop-machine internally elect cpumask_first(cpu_online_mask), i.e., CPU0 as
         the cpu to run the payload (the update_cpu_topology() function).
      
      4. This causes update_cpu_topology() to be run by CPU0. And since 'updates'
         is kzalloc()'ed inside arch_update_cpu_topology(), update_cpu_topology()
         finds update->cpu as well as update->new_nid to be 0. In other words, we
         end up assigning CPU0 (and eventually its siblings) to node 0, incorrectly.
      
      Along with the following wrong updating, it causes the sched-domain rebuild
      code to break and crash the system.
      
      Fix this by skipping the topology update in cases where we find that
      the topology has not actually changed in reality (ie., spurious updates).
      
      CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      CC: Paul Mackerras <paulus@samba.org>
      CC: Nathan Fontenot <nfont@linux.vnet.ibm.com>
      CC: Stephen Rothwell <sfr@canb.auug.org.au>
      CC: Andrew Morton <akpm@linux-foundation.org>
      CC: Robert Jennings <rcj@linux.vnet.ibm.com>
      CC: Jesse Larrew <jlarrew@linux.vnet.ibm.com>
      CC: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
      CC: Alistair Popple <alistair@popple.id.au>
      Suggested-by: default avatar"Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
      Signed-off-by: default avatarMichael Wang <wangyun@linux.vnet.ibm.com>
      Reviewed-by: default avatarSrivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Signed-off-by: default avatarBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      c4b022d1
    • Lee Duncan's avatar
      target: Allow Write Exclusive non-reservation holders to READ · e007687c
      Lee Duncan authored
      commit 1ecc7586 upstream.
      
      For PGR reservation of type Write Exclusive Access, allow all non
      reservation holding I_T nexuses with active registrations to READ
      from the device.
      
      This addresses a bug where active registrations that attempted
      to READ would result in an reservation conflict.
      Signed-off-by: default avatarLee Duncan <lduncan@suse.com>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      e007687c
    • Nicholas Bellinger's avatar
      target: Allow AllRegistrants to re-RESERVE existing reservation · 84d0034c
      Nicholas Bellinger authored
      commit ae450e24 upstream.
      
      This patch changes core_scsi3_pro_release() logic to allow an
      existing AllRegistrants type reservation to be re-reserved by
      any registered I_T nexus.
      
      This addresses a issue where AllRegistrants type RESERVE was
      receiving RESERVATION_CONFLICT status if dev_pr_res_holder did
      not match the same I_T nexus, instead of just returning GOOD
      status following spc4r34 Section 5.9.9:
      
      "If the device server receives a PERSISTENT RESERVE OUT command
       with RESERVE service action where the TYPE field and the SCOPE
       field contain the same values as the existing type and scope
       from a persistent reservation holder, it shall not make any
       change to the existing persistent reservation and shall complete
       the command with GOOD status."
      Reported-by: default avatarIlias Tsitsimpis <i.tsitsimpis@gmail.com>
      Cc: Ilias Tsitsimpis <i.tsitsimpis@gmail.com>
      Cc: Lee Duncan <lduncan@suse.com>
      Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      84d0034c
    • Nicholas Bellinger's avatar
      target: Avoid dropping AllRegistrants reservation during unregister · 24987ee3
      Nicholas Bellinger authored
      commit 6c3c9baa upstream.
      
      This patch fixes an issue with AllRegistrants reservations where
      an unregister operation by the I_T nexus reservation holder would
      incorrectly drop the reservation, instead of waiting until the
      last active I_T nexus is unregistered as per SPC-4.
      
      This includes updating __core_scsi3_complete_pro_release() to reset
      dev->dev_pr_res_holder with another pr_reg for this special case,
      as well as a new 'unreg' parameter to determine when the release
      is occuring from an implicit unregister, vs. explicit RELEASE.
      
      It also adds special handling in core_scsi3_free_pr_reg_from_nacl()
      to release the left-over pr_res_holder, now that pr_reg is deleted
      from pr_reg_list within __core_scsi3_complete_pro_release().
      Reported-by: default avatarIlias Tsitsimpis <i.tsitsimpis@gmail.com>
      Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      24987ee3
    • Nicholas Bellinger's avatar
      target: Fix R_HOLDER bit usage for AllRegistrants · fcc609e5
      Nicholas Bellinger authored
      commit d16ca7c5 upstream.
      
      This patch fixes the usage of R_HOLDER bit for an All Registrants
      reservation in READ_FULL_STATUS, where only the registration who
      issued RESERVE was being reported as having an active reservation.
      
      It changes core_scsi3_pri_read_full_status() to check ahead of the
      list walk of active registrations to see if All Registrants is active,
      and if so set R_HOLDER bit and scope/type fields for all active
      registrations.
      Reported-by: default avatarIlias Tsitsimpis <i.tsitsimpis@gmail.com>
      Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      fcc609e5
    • Nicholas Bellinger's avatar
      target/pscsi: Fix NULL pointer dereference in get_device_type · d8e73127
      Nicholas Bellinger authored
      commit 215a8fe4 upstream.
      
      This patch fixes a NULL pointer dereference OOPs with pSCSI backends
      within target_core_stat.c code.  The bug is caused by a configfs attr
      read if no pscsi_dev_virt->pdv_sd has been configured.
      Reported-by: default avatarOlaf Hering <olaf@aepfle.de>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      d8e73127
    • Nicholas Bellinger's avatar
      iscsi-target: Avoid early conn_logout_comp for iser connections · 2ddc09fb
      Nicholas Bellinger authored
      commit f068fbc8 upstream.
      
      This patch fixes a iser specific logout bug where early complete()
      of conn->conn_logout_comp in iscsit_close_connection() was causing
      isert_wait4logout() to complete too soon, triggering a use after
      free NULL pointer dereference of iscsi_conn memory.
      
      The complete() was originally added for traditional iscsi-target
      when a ISCSI_LOGOUT_OP failed in iscsi_target_rx_opcode(), but given
      iser-target does not wait in logout failure, this special case needs
      to be avoided.
      Reported-by: default avatarSagi Grimberg <sagig@mellanox.com>
      Cc: Sagi Grimberg <sagig@mellanox.com>
      Cc: Slava Shwartsman <valyushash@gmail.com>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      2ddc09fb
    • Nicholas Bellinger's avatar
      target: Fix virtual LUN=0 target_configure_device failure OOPs · 033acbf6
      Nicholas Bellinger authored
      commit 5f7da044 upstream.
      
      This patch fixes a NULL pointer dereference triggered by a late
      target_configure_device() -> alloc_workqueue() failure that results
      in target_free_device() being called with DF_CONFIGURED already set,
      which subsequently OOPses in destroy_workqueue() code.
      
      Currently this only happens at modprobe target_core_mod time when
      core_dev_setup_virtual_lun0() -> target_configure_device() fails,
      and the explicit target_free_device() gets called.
      
      To address this bug originally introduced by commit 0fd97ccf, go
      ahead and move DF_CONFIGURED to end of target_configure_device()
      code to handle this special failure case.
      Reported-by: default avatarClaudio Fleiner <cmf@daterainc.com>
      Cc: Claudio Fleiner <cmf@daterainc.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      033acbf6
    • Bart Van Assche's avatar
      target: Fix reference leak in target_get_sess_cmd() error path · 64e6cd75
      Bart Van Assche authored
      commit 7544e597 upstream.
      
      This patch fixes a se_cmd->cmd_kref leak buf when se_sess->sess_tearing_down
      is true within target_get_sess_cmd() submission path code.
      
      This se_cmd reference leak can occur during active session shutdown when
      ack_kref=1 is passed by target_submit_cmd_[map_sgls,tmr]() callers.
      Signed-off-by: default avatarBart Van Assche <bart.vanassche@sandisk.com>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      64e6cd75
    • Alexandre Belloni's avatar
      ARM: at91: pm: fix at91rm9200 standby · fa95ae25
      Alexandre Belloni authored
      commit 84e87166 upstream.
      
      at91rm9200 standby and suspend to ram has been broken since
      00482a40. It is wrongly using AT91_BASE_SYS which is a physical address
      and actually doesn't correspond to any register on at91rm9200.
      
      Use the correct at91_ramc_base[0] instead.
      
      Fixes: 00482a40 (ARM: at91: implement the standby function for pm/cpuidle)
      Signed-off-by: default avatarAlexandre Belloni <alexandre.belloni@free-electrons.com>
      Signed-off-by: default avatarNicolas Ferre <nicolas.ferre@atmel.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      fa95ae25
    • Jiri Slaby's avatar
      x86/vdso: Fix the build on GCC5 · eda3a683
      Jiri Slaby authored
      commit e8932869 upstream.
      
      On gcc5 the kernel does not link:
      
        ld: .eh_frame_hdr table[4] FDE at 0000000000000648 overlaps table[5] FDE at 0000000000000670.
      
      Because prior GCC versions always emitted NOPs on ALIGN directives, but
      gcc5 started omitting them.
      
      .LSTARTFDEDLSI1 says:
      
              /* HACK: The dwarf2 unwind routines will subtract 1 from the
                 return address to get an address in the middle of the
                 presumed call instruction.  Since we didn't get here via
                 a call, we need to include the nop before the real start
                 to make up for it.  */
              .long .LSTART_sigreturn-1-.     /* PC-relative start address */
      
      But commit 69d0627a ("x86 vDSO: reorder vdso32 code") from 2.6.25
      replaced .org __kernel_vsyscall+32,0x90 by ALIGN right before
      __kernel_sigreturn.
      
      Of course, ALIGN need not generate any NOP in there. Esp. gcc5 collapses
      vclock_gettime.o and int80.o together with no generated NOPs as "ALIGN".
      
      So fix this by adding to that point at least a single NOP and make the
      function ALIGN possibly with more NOPs then.
      
      Kudos for reporting and diagnosing should go to Richard.
      Reported-by: default avatarRichard Biener <rguenther@suse.de>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      Acked-by: default avatarAndy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1425543211-12542-1-git-send-email-jslaby@suse.czSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      eda3a683
    • Oleg Nesterov's avatar
      x86/fpu: Drop_fpu() should not assume that tsk equals current · 63a22cd2
      Oleg Nesterov authored
      commit f4c36863 upstream.
      
      drop_fpu() does clear_used_math() and usually this is correct
      because tsk == current.
      
      However switch_fpu_finish()->restore_fpu_checking() is called before
      __switch_to() updates the "current_task" variable. If it fails,
      we will wrongly clear the PF_USED_MATH flag of the previous task.
      
      So use clear_stopped_child_used_math() instead.
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Reviewed-by: default avatarRik van Riel <riel@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Pekka Riikonen <priikone@iki.fi>
      Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
      Cc: Suresh Siddha <sbsiddha@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20150309171041.GB11388@redhat.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      63a22cd2
    • Oleg Nesterov's avatar
      x86/fpu: Avoid math_state_restore() without used_math() in __restore_xstate_sig() · c382613c
      Oleg Nesterov authored
      commit a7c80ebc upstream.
      
      math_state_restore() assumes it is called with irqs disabled,
      but this is not true if the caller is __restore_xstate_sig().
      
      This means that if ia32_fxstate == T and __copy_from_user()
      fails, __restore_xstate_sig() returns with irqs disabled too.
      
      This triggers:
      
        BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:41
         dump_stack
         ___might_sleep
         ? _raw_spin_unlock_irqrestore
         __might_sleep
         down_read
         ? _raw_spin_unlock_irqrestore
         print_vma_addr
         signal_fault
         sys32_rt_sigreturn
      
      Change __restore_xstate_sig() to call set_used_math()
      unconditionally. This avoids enabling and disabling interrupts
      in math_state_restore(). If copy_from_user() fails, we can
      simply do fpu_finit() by hand.
      
      [ Note: this is only the first step. math_state_restore() should
              not check used_math(), it should set this flag. While
      	init_fpu() should simply die. ]
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Pekka Riikonen <priikone@iki.fi>
      Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Suresh Siddha <sbsiddha@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20150307153844.GB25954@redhat.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      c382613c
    • Stephan Mueller's avatar
      crypto: aesni - fix memory usage in GCM decryption · 0585664d
      Stephan Mueller authored
      commit ccfe8c3f upstream.
      
      The kernel crypto API logic requires the caller to provide the
      length of (ciphertext || authentication tag) as cryptlen for the
      AEAD decryption operation. Thus, the cipher implementation must
      calculate the size of the plaintext output itself and cannot simply use
      cryptlen.
      
      The RFC4106 GCM decryption operation tries to overwrite cryptlen memory
      in req->dst. As the destination buffer for decryption only needs to hold
      the plaintext memory but cryptlen references the input buffer holding
      (ciphertext || authentication tag), the assumption of the destination
      buffer length in RFC4106 GCM operation leads to a too large size. This
      patch simply uses the already calculated plaintext size.
      
      In addition, this patch fixes the offset calculation of the AAD buffer
      pointer: as mentioned before, cryptlen already includes the size of the
      tag. Thus, the tag does not need to be added. With the addition, the AAD
      will be written beyond the already allocated buffer.
      
      Note, this fixes a kernel crash that can be triggered from user space
      via AF_ALG(aead) -- simply use the libkcapi test application
      from [1] and update it to use rfc4106-gcm-aes.
      
      Using [1], the changes were tested using CAVS vectors to demonstrate
      that the crypto operation still delivers the right results.
      
      [1] http://www.chronox.de/libkcapi.html
      
      CC: Tadeusz Struk <tadeusz.struk@intel.com>
      Signed-off-by: default avatarStephan Mueller <smueller@chronox.de>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      0585664d
    • Kirill A. Shutemov's avatar
      pagemap: do not leak physical addresses to non-privileged userspace · 91e9d770
      Kirill A. Shutemov authored
      commit ab676b7d upstream.
      
      As pointed by recent post[1] on exploiting DRAM physical imperfection,
      /proc/PID/pagemap exposes sensitive information which can be used to do
      attacks.
      
      This disallows anybody without CAP_SYS_ADMIN to read the pagemap.
      
      [1] http://googleprojectzero.blogspot.com/2015/03/exploiting-dram-rowhammer-bug-to-gain.html
      
      [ Eventually we might want to do anything more finegrained, but for now
        this is the simple model.   - Linus ]
      Signed-off-by: default avatarKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: default avatarKonstantin Khlebnikov <khlebnikov@openvz.org>
      Acked-by: default avatarAndy Lutomirski <luto@amacapital.net>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Mark Seaborn <mseaborn@chromium.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      91e9d770
    • James Bottomley's avatar
      libsas: Fix Kernel Crash in smp_execute_task · 74101162
      James Bottomley authored
      commit 6302ce4d upstream.
      
      This crash was reported:
      
      [  366.947370] sd 3:0:1:0: [sdb] Spinning up disk....
      [  368.804046] BUG: unable to handle kernel NULL pointer dereference at           (null)
      [  368.804072] IP: [<ffffffff81358457>] __mutex_lock_common.isra.7+0x9c/0x15b
      [  368.804098] PGD 0
      [  368.804114] Oops: 0002 [#1] SMP
      [  368.804143] CPU 1
      [  368.804151] Modules linked in: sg netconsole s3g(PO) uinput joydev hid_multitouch usbhid hid snd_hda_codec_via cpufreq_userspace cpufreq_powersave cpufreq_stats uhci_hcd cpufreq_conservative snd_hda_intel snd_hda_codec snd_hwdep snd_pcm sdhci_pci snd_page_alloc sdhci snd_timer snd psmouse evdev serio_raw pcspkr soundcore xhci_hcd shpchp s3g_drm(O) mvsas mmc_core ahci libahci drm i2c_core acpi_cpufreq mperf video processor button thermal_sys dm_dmirror exfat_fs exfat_core dm_zcache dm_mod padlock_aes aes_generic padlock_sha iscsi_target_mod target_core_mod configfs sswipe libsas libata scsi_transport_sas picdev via_cputemp hwmon_vid fuse parport_pc ppdev lp parport autofs4 ext4 crc16 mbcache jbd2 sd_mod crc_t10dif usb_storage scsi_mod ehci_hcd usbcore usb_common
      [  368.804749]
      [  368.804764] Pid: 392, comm: kworker/u:3 Tainted: P        W  O 3.4.87-logicube-ng.22 #1 To be filled by O.E.M. To be filled by O.E.M./EPIA-M920
      [  368.804802] RIP: 0010:[<ffffffff81358457>]  [<ffffffff81358457>] __mutex_lock_common.isra.7+0x9c/0x15b
      [  368.804827] RSP: 0018:ffff880117001cc0  EFLAGS: 00010246
      [  368.804842] RAX: 0000000000000000 RBX: ffff8801185030d0 RCX: ffff88008edcb420
      [  368.804857] RDX: 0000000000000000 RSI: 0000000000000002 RDI: ffff8801185030d4
      [  368.804873] RBP: ffff8801181531c0 R08: 0000000000000020 R09: 00000000fffffffe
      [  368.804885] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801185030d4
      [  368.804899] R13: 0000000000000002 R14: ffff880117001fd8 R15: ffff8801185030d8
      [  368.804916] FS:  0000000000000000(0000) GS:ffff88011fc80000(0000) knlGS:0000000000000000
      [  368.804931] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [  368.804946] CR2: 0000000000000000 CR3: 000000000160b000 CR4: 00000000000006e0
      [  368.804962] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  368.804978] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      [  368.804995] Process kworker/u:3 (pid: 392, threadinfo ffff880117000000, task ffff8801181531c0)
      [  368.805009] Stack:
      [  368.805017]  ffff8801185030d8 0000000000000000 ffffffff8161ddf0 ffffffff81056f7c
      [  368.805062]  000000000000b503 ffff8801185030d0 ffff880118503000 0000000000000000
      [  368.805100]  ffff8801185030d0 ffff8801188b8000 ffff88008edcb420 ffffffff813583ac
      [  368.805135] Call Trace:
      [  368.805153]  [<ffffffff81056f7c>] ? up+0xb/0x33
      [  368.805168]  [<ffffffff813583ac>] ? mutex_lock+0x16/0x25
      [  368.805194]  [<ffffffffa018c414>] ? smp_execute_task+0x4e/0x222 [libsas]
      [  368.805217]  [<ffffffffa018ce1c>] ? sas_find_bcast_dev+0x3c/0x15d [libsas]
      [  368.805240]  [<ffffffffa018ce4f>] ? sas_find_bcast_dev+0x6f/0x15d [libsas]
      [  368.805264]  [<ffffffffa018e989>] ? sas_ex_revalidate_domain+0x37/0x2ec [libsas]
      [  368.805280]  [<ffffffff81355a2a>] ? printk+0x43/0x48
      [  368.805296]  [<ffffffff81359a65>] ? _raw_spin_unlock_irqrestore+0xc/0xd
      [  368.805318]  [<ffffffffa018b767>] ? sas_revalidate_domain+0x85/0xb6 [libsas]
      [  368.805336]  [<ffffffff8104e5d9>] ? process_one_work+0x151/0x27c
      [  368.805351]  [<ffffffff8104f6cd>] ? worker_thread+0xbb/0x152
      [  368.805366]  [<ffffffff8104f612>] ? manage_workers.isra.29+0x163/0x163
      [  368.805382]  [<ffffffff81052c4e>] ? kthread+0x79/0x81
      [  368.805399]  [<ffffffff8135fea4>] ? kernel_thread_helper+0x4/0x10
      [  368.805416]  [<ffffffff81052bd5>] ? kthread_flush_work_fn+0x9/0x9
      [  368.805431]  [<ffffffff8135fea0>] ? gs_change+0x13/0x13
      [  368.805442] Code: 83 7d 30 63 7e 04 f3 90 eb ab 4c 8d 63 04 4c 8d 7b 08 4c 89 e7 e8 fa 15 00 00 48 8b 43 10 4c 89 3c 24 48 89 63 10 48 89 44 24 08 <48> 89 20 83 c8 ff 48 89 6c 24 10 87 03 ff c8 74 35 4d 89 ee 41
      [  368.805851] RIP  [<ffffffff81358457>] __mutex_lock_common.isra.7+0x9c/0x15b
      [  368.805877]  RSP <ffff880117001cc0>
      [  368.805886] CR2: 0000000000000000
      [  368.805899] ---[ end trace b720682065d8f4cc ]---
      
      It's directly caused by 89d3cf6a [SCSI] libsas: add mutex for SMP task
      execution, but shows a deeper cause: expander functions expect to be able to
      cast to and treat domain devices as expanders.  The correct fix is to only do
      expander discover when we know we've got an expander device to avoid wrongly
      casting a non-expander device.
      Reported-by: default avatarPraveen Murali <pmurali@logicube.com>
      Tested-by: default avatarPraveen Murali <pmurali@logicube.com>
      Signed-off-by: default avatarJames Bottomley <JBottomley@Parallels.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      74101162
    • Jan Beulich's avatar
      xen-pciback: limit guest control of command register · f82a9cff
      Jan Beulich authored
      commit af6fc858 upstream.
      
      Otherwise the guest can abuse that control to cause e.g. PCIe
      Unsupported Request responses by disabling memory and/or I/O decoding
      and subsequently causing (CPU side) accesses to the respective address
      ranges, which (depending on system configuration) may be fatal to the
      host.
      
      Note that to alter any of the bits collected together as
      PCI_COMMAND_GUEST permissive mode is now required to be enabled
      globally or on the specific device.
      
      This is CVE-2015-2150 / XSA-120.
      Signed-off-by: default avatarJan Beulich <jbeulich@suse.com>
      Reviewed-by: default avatarKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      f82a9cff
    • Thomas Hellstrom's avatar
      drm/vmwgfx: Reorder device takedown somewhat · bb025b7b
      Thomas Hellstrom authored
      commit 3458390b upstream.
      
      To take down the MOB and GMR memory types, the driver may have to issue
      fence objects and thus make sure that the fence manager is taken down
      after those memory types.
      Reorder device init accordingly.
      Signed-off-by: default avatarThomas Hellstrom <thellstrom@vmware.com>
      Reviewed-by: default avatarSinclair Yeh <syeh@vmware.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      bb025b7b
    • Ryusuke Konishi's avatar
      nilfs2: fix deadlock of segment constructor during recovery · 6fcad4d1
      Ryusuke Konishi authored
      commit 283ee148 upstream.
      
      According to a report from Yuxuan Shui, nilfs2 in kernel 3.19 got stuck
      during recovery at mount time.  The code path that caused the deadlock was
      as follows:
      
        nilfs_fill_super()
          load_nilfs()
            nilfs_salvage_orphan_logs()
              * Do roll-forwarding, attach segment constructor for recovery,
                and kick it.
      
              nilfs_segctor_thread()
                nilfs_segctor_thread_construct()
                 * A lock is held with nilfs_transaction_lock()
                   nilfs_segctor_do_construct()
                     nilfs_segctor_drop_written_files()
                       iput()
                         iput_final()
                           write_inode_now()
                             writeback_single_inode()
                               __writeback_single_inode()
                                 do_writepages()
                                   nilfs_writepage()
                                     nilfs_construct_dsync_segment()
                                       nilfs_transaction_lock() --> deadlock
      
      This can happen if commit 7ef3ff2f ("nilfs2: fix deadlock of segment
      constructor over I_SYNC flag") is applied and roll-forward recovery was
      performed at mount time.  The roll-forward recovery can happen if datasync
      write is done and the file system crashes immediately after that.  For
      instance, we can reproduce the issue with the following steps:
      
       < nilfs2 is mounted on /nilfs (device: /dev/sdb1) >
       # dd if=/dev/zero of=/nilfs/test bs=4k count=1 && sync
       # dd if=/dev/zero of=/nilfs/test conv=notrunc oflag=dsync bs=4k
       count=1 && reboot -nfh
       < the system will immediately reboot >
       # mount -t nilfs2 /dev/sdb1 /nilfs
      
      The deadlock occurs because iput() can run segment constructor through
      writeback_single_inode() if MS_ACTIVE flag is not set on sb->s_flags.  The
      above commit changed segment constructor so that it calls iput()
      asynchronously for inodes with i_nlink == 0, but that change was
      imperfect.
      
      This fixes the another deadlock by deferring iput() in segment constructor
      even for the case that mount is not finished, that is, for the case that
      MS_ACTIVE flag is not set.
      Signed-off-by: default avatarRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
      Reported-by: default avatarYuxuan Shui <yshuiv7@gmail.com>
      Tested-by: default avatarRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      6fcad4d1
    • Doug Anderson's avatar
      regulator: core: Fix enable GPIO reference counting · 3f37b376
      Doug Anderson authored
      commit 29d62ec5 upstream.
      
      Normally _regulator_do_enable() isn't called on an already-enabled
      rdev.  That's because the main caller, _regulator_enable() always
      calls _regulator_is_enabled() and only calls _regulator_do_enable() if
      the rdev was not already enabled.
      
      However, there is one caller of _regulator_do_enable() that doesn't
      check: regulator_suspend_finish().  While we might want to make
      regulator_suspend_finish() behave more like _regulator_enable(), it's
      probably also a good idea to make _regulator_do_enable() robust if it
      is called on an already enabled rdev.
      
      At the moment, _regulator_do_enable() is _not_ robust for already
      enabled rdevs if we're using an ena_pin.  Each time
      _regulator_do_enable() is called for an rdev using an ena_pin the
      reference count of the ena_pin is incremented even if the rdev was
      already enabled.  This is not as intended because the ena_pin is for
      something else: for keeping track of how many active rdevs there are
      sharing the same ena_pin.
      
      Here's how the reference counting works here:
      
      * Each time _regulator_enable() is called we increment
        rdev->use_count, so _regulator_enable() calls need to be balanced
        with _regulator_disable() calls.
      
      * There is no explicit reference counting in _regulator_do_enable()
        which is normally just a warapper around rdev->desc->ops->enable()
        with code for supporting delays.  It's not expected that the
        "ops->enable()" call do reference counting.
      
      * Since regulator_ena_gpio_ctrl() does have reference counting
        (handling the sharing of the pin amongst multiple rdevs), we
        shouldn't call it if the current rdev is already enabled.
      
      Note that as part of this we cleanup (remove) the initting of
      ena_gpio_state in regulator_register().  In _regulator_do_enable(),
      _regulator_do_disable() and _regulator_is_enabled() is is clear that
      ena_gpio_state should be the state of whether this particular rdev has
      requested the GPIO be enabled.  regulator_register() was initting it
      as the actual state of the pin.
      
      Fixes: 967cfb18 ("regulator: core: manage enable GPIO list")
      Signed-off-by: default avatarDoug Anderson <dianders@chromium.org>
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      3f37b376
    • Javier Martinez Canillas's avatar
      regulator: Only enable disabled regulators on resume · d441a06e
      Javier Martinez Canillas authored
      commit 0548bf4f upstream.
      
      The _regulator_do_enable() call ought to be a no-op when called on an
      already-enabled regulator.  However, as an optimization
      _regulator_enable() doesn't call _regulator_do_enable() on an already
      enabled regulator.  That means we never test the case of calling
      _regulator_do_enable() during normal usage and there may be hidden
      bugs or warnings.  We have seen warnings issued by the tps65090 driver
      and bugs when using the GPIO enable pin.
      
      Let's match the same optimization that _regulator_enable() in
      regulator_suspend_finish().  That may speed up suspend/resume and also
      avoids exposing hidden bugs.
      
      [Use much clearer commit message from Doug Anderson]
      Signed-off-by: default avatarJavier Martinez Canillas <javier.martinez@collabora.co.uk>
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      d441a06e
    • Brian King's avatar
      bnx2x: Force fundamental reset for EEH recovery · 629e1fd8
      Brian King authored
      commit da293700 upstream.
      
      EEH recovery for bnx2x based adapters is not reliable on all Power
      systems using the default hot reset, which can result in an
      unrecoverable EEH error. Forcing the use of fundamental reset
      during EEH recovery fixes this.
      Signed-off-by: default avatarBrian King <brking@linux.vnet.ibm.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      629e1fd8
    • Takashi Iwai's avatar
      ALSA: hda - Treat stereo-to-mono mix properly · fd2f4b31
      Takashi Iwai authored
      commit cc261738 upstream.
      
      The commit [ef403edb: ALSA: hda - Don't access stereo amps for
      mono channel widgets] fixed the handling of mono widgets in general,
      but it still misses an exceptional case: namely, a mono mixer widget
      taking a single stereo input.  In this case, it has stereo volumes
      although it's a mono widget, and thus we have to take care of both
      left and right input channels, as stated in HD-audio spec ("7.1.3
      Widget Interconnection Rules").
      
      This patch covers this missing piece by adding proper checks of stereo
      amps in both the generic parser and the proc output codes.
      Reported-by: default avatarRaymond Yau <superquad.vortex2@gmail.com>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      fd2f4b31
    • Takashi Iwai's avatar
      ALSA: hda - Add workaround for MacBook Air 5,2 built-in mic · 3453d865
      Takashi Iwai authored
      commit 2ddee91a upstream.
      
      MacBook Air 5,2 has the same problem as MacBook Pro 8,1 where the
      built-in mic records only the right channel.  Apply the same
      workaround as MBP8,1 to spread the mono channel via a Cirrus codec
      vendor-specific COEF setup.
      Reported-and-tested-by: default avatarVasil Zlatanov <vasil.zlatanov@gmail.com>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      3453d865
    • Takashi Iwai's avatar
      ALSA: hda - Set single_adc_amp flag for CS420x codecs · 5917ae96
      Takashi Iwai authored
      commit bad994f5 upstream.
      
      CS420x codecs seem to deal only the single amps of ADC nodes even
      though the nodes receive multiple inputs.  This leads to the
      inconsistent amp value after S3/S4 resume, for example.
      
      The fix is just to set codec->single_adc_amp flag.  Then the driver
      handles these ADC amps as if single connections.
      Reported-and-tested-by: default avatarVasil Zlatanov <vasil.zlatanov@gmail.com>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      5917ae96
    • Takashi Iwai's avatar
      ALSA: hda - Don't access stereo amps for mono channel widgets · cf0a9563
      Takashi Iwai authored
      commit ef403edb upstream.
      
      The current HDA generic parser initializes / modifies the amp values
      always in stereo, but this seems causing the problem on ALC3229 codec
      that has a few mono channel widgets: namely, these mono widgets react
      to actions for both channels equally.
      
      In the driver code, we do care the mono channel and create a control
      only for the left channel (as defined in HD-audio spec) for such a
      node.  When the control is updated, only the left channel value is
      changed.  However, in the resume, the right channel value is also
      restored from the initial value we took as stereo, and this overwrites
      the left channel value.  This ends up being the silent output as the
      right channel has been never touched and remains muted.
      
      This patch covers the places where unconditional stereo amp accesses
      are done and converts to the conditional accesses.
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=94581Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      cf0a9563
    • Takashi Iwai's avatar
      ALSA: hda - Fix built-in mic on Compaq Presario CQ60 · c3e88945
      Takashi Iwai authored
      commit ddb6ca75 upstream.
      
      Compaq Presario CQ60 laptop with CX20561 gives a wrong pin for the
      built-in mic NID 0x17 instead of NID 0x1d, and it results in the
      non-working mic.  This patch just remaps the pin correctly via fixup.
      
      Bugzilla: https://bugzilla.opensuse.org/show_bug.cgi?id=920604Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      c3e88945
    • Takashi Iwai's avatar
      ALSA: control: Add sanity checks for user ctl id name string · f6b485a4
      Takashi Iwai authored
      commit be3bb823 upstream.
      
      There was no check about the id string of user control elements, so we
      accepted even a control element with an empty string, which is
      obviously bogus.  This patch adds more sanity checks of id strings.
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      f6b485a4
    • Daniel Mack's avatar
      ALSA: snd-usb: add quirks for Roland UA-22 · 56296033
      Daniel Mack authored
      commit fcdcd1de upstream.
      
      The device complies to the UAC1 standard but hides that fact with
      proprietary descriptors. The autodetect quirk for Roland devices
      catches the audio interface but misses the MIDI part, so a specific
      quirk is needed.
      Signed-off-by: default avatarDaniel Mack <daniel@zonque.org>
      Reported-by: default avatarRafa Lafuente <rafalafuente@gmail.com>
      Tested-by: default avatarRaphaël Doursenaud <raphael@doursenaud.fr>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      56296033
    • Alexander Sverdlin's avatar
      spi: pl022: Fix race in giveback() leading to driver lock-up · a0485132
      Alexander Sverdlin authored
      commit cd6fa8d2 upstream.
      
      Commit fd316941 ("spi/pl022: disable port when unused") introduced a race,
      which leads to possible driver lock up (easily reproducible on SMP).
      
      The problem happens in giveback() function where the completion of the transfer
      is signalled to SPI subsystem and then the HW SPI controller is disabled. Another
      transfer might be setup in between, which brings driver in locked-up state.
      
      Exact event sequence on SMP:
      
      core0                                   core1
      
                                              => pump_transfers()
                                              /* message->state == STATE_DONE */
                                                => giveback()
                                                  => spi_finalize_current_message()
      
      => pl022_unprepare_transfer_hardware()
      => pl022_transfer_one_message
        => flush()
        => do_interrupt_dma_transfer()
          => set_up_next_transfer()
          /* Enable SSP, turn on interrupts */
          writew((readw(SSP_CR1(pl022->virtbase)) |
                 SSP_CR1_MASK_SSE), SSP_CR1(pl022->virtbase));
      
      ...
      
      => pl022_interrupt_handler()
        => readwriter()
      
                                              /* disable the SPI/SSP operation */
                                              => writew((readw(SSP_CR1(pl022->virtbase)) &
                                                        (~SSP_CR1_MASK_SSE)), SSP_CR1(pl022->virtbase));
      
      Lockup! SPI controller is disabled and the data will never be received. Whole
      SPI subsystem is waiting for transfer ACK and blocked.
      
      So, only signal transfer completion after disabling the controller.
      
      Fixes: fd316941 (spi/pl022: disable port when unused)
      Signed-off-by: default avatarAlexander Sverdlin <alexander.sverdlin@nokia.com>
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      a0485132
    • jmlatten@linux.vnet.ibm.com's avatar
      tpm/ibmvtpm: Additional LE support for tpm_ibmvtpm_send · 78401b55
      jmlatten@linux.vnet.ibm.com authored
      commit 62dfd912 upstream.
      
      Problem: When IMA and VTPM are both enabled in kernel config,
      kernel hangs during bootup on LE OS.
      
      Why?: IMA calls tpm_pcr_read() which results in tpm_ibmvtpm_send
      and tpm_ibmtpm_recv getting called. A trace showed that
      tpm_ibmtpm_recv was hanging.
      
      Resolution: tpm_ibmtpm_recv was hanging because tpm_ibmvtpm_send
      was sending CRQ message that probably did not make much sense
      to phype because of Endianness. The fix below sends correctly
      converted CRQ for LE. This was not caught before because it
      seems IMA is not enabled by default in kernel config and
      IMA exercises this particular code path in vtpm.
      
      Tested with IMA and VTPM enabled in kernel config and VTPM
      enabled on both a BE OS and a LE OS ppc64 lpar. This exercised
      CRQ and TPM command code paths in vtpm.
      Patch is against Peter's tpmdd tree on github which included
      Vicky's previous vtpm le patches.
      Signed-off-by: default avatarJoy Latten <jmlatten@linux.vnet.ibm.com>
      Reviewed-by: default avatarAshley Lai <ashley@ahsleylai.com>
      Signed-off-by: default avatarPeter Huewe <peterhuewe@gmx.de>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      78401b55
    • Jason Low's avatar
      cpuset: Fix cpuset sched_relax_domain_level · 90b682b6
      Jason Low authored
      commit 283cb41f upstream.
      
      The cpuset.sched_relax_domain_level can control how far we do
      immediate load balancing on a system. However, it was found on recent
      kernels that echo'ing a value into cpuset.sched_relax_domain_level
      did not reduce any immediate load balancing.
      
      The reason this occurred was because the update_domain_attr_tree() traversal
      did not update for the "top_cpuset". This resulted in nothing being changed
      when modifying the sched_relax_domain_level parameter.
      
      This patch is able to address that problem by having update_domain_attr_tree()
      allow updates for the root in the cpuset traversal.
      
      Fixes: fc560a26 ("cpuset: replace cpuset->stack_list with cpuset_for_each_descendant_pre()")
      Signed-off-by: default avatarJason Low <jason.low2@hp.com>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Tested-by: default avatarSerge Hallyn <serge.hallyn@canonical.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      90b682b6
    • Tejun Heo's avatar
      workqueue: fix hang involving racing cancel[_delayed]_work_sync()'s for PREEMPT_NONE · 97b57f41
      Tejun Heo authored
      commit 8603e1b3 upstream.
      
      cancel[_delayed]_work_sync() are implemented using
      __cancel_work_timer() which grabs the PENDING bit using
      try_to_grab_pending() and then flushes the work item with PENDING set
      to prevent the on-going execution of the work item from requeueing
      itself.
      
      try_to_grab_pending() can always grab PENDING bit without blocking
      except when someone else is doing the above flushing during
      cancelation.  In that case, try_to_grab_pending() returns -ENOENT.  In
      this case, __cancel_work_timer() currently invokes flush_work().  The
      assumption is that the completion of the work item is what the other
      canceling task would be waiting for too and thus waiting for the same
      condition and retrying should allow forward progress without excessive
      busy looping
      
      Unfortunately, this doesn't work if preemption is disabled or the
      latter task has real time priority.  Let's say task A just got woken
      up from flush_work() by the completion of the target work item.  If,
      before task A starts executing, task B gets scheduled and invokes
      __cancel_work_timer() on the same work item, its try_to_grab_pending()
      will return -ENOENT as the work item is still being canceled by task A
      and flush_work() will also immediately return false as the work item
      is no longer executing.  This puts task B in a busy loop possibly
      preventing task A from executing and clearing the canceling state on
      the work item leading to a hang.
      
      task A			task B			worker
      
      						executing work
      __cancel_work_timer()
        try_to_grab_pending()
        set work CANCELING
        flush_work()
          block for work completion
      						completion, wakes up A
      			__cancel_work_timer()
      			while (forever) {
      			  try_to_grab_pending()
      			    -ENOENT as work is being canceled
      			  flush_work()
      			    false as work is no longer executing
      			}
      
      This patch removes the possible hang by updating __cancel_work_timer()
      to explicitly wait for clearing of CANCELING rather than invoking
      flush_work() after try_to_grab_pending() fails with -ENOENT.
      
      Link: http://lkml.kernel.org/g/20150206171156.GA8942@axis.com
      
      v3: bit_waitqueue() can't be used for work items defined in vmalloc
          area.  Switched to custom wake function which matches the target
          work item and exclusive wait and wakeup.
      
      v2: v1 used wake_up() on bit_waitqueue() which leads to NULL deref if
          the target bit waitqueue has wait_bit_queue's on it.  Use
          DEFINE_WAIT_BIT() and __wake_up_bit() instead.  Reported by Tomeu
          Vizoso.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Reported-by: default avatarRabin Vincent <rabin.vincent@axis.com>
      Cc: Tomeu Vizoso <tomeu.vizoso@gmail.com>
      Tested-by: default avatarJesper Nilsson <jesper.nilsson@axis.com>
      Tested-by: default avatarRabin Vincent <rabin.vincent@axis.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      97b57f41
    • Oliver Hartkopp's avatar
      can: add missing initialisations in CAN related skbuffs · c05e6160
      Oliver Hartkopp authored
      commit 96943901 upstream.
      
      When accessing CAN network interfaces with AF_PACKET sockets e.g. by dhclient
      this can lead to a skb_under_panic due to missing skb initialisations.
      
      Add the missing initialisations at the CAN skbuff creation times on driver
      level (rx path) and in the network layer (tx path).
      Reported-by: default avatarAustin Schuh <austin@peloton-tech.com>
      Reported-by: default avatarDaniel Steer <daniel.steer@mclaren.com>
      Signed-off-by: default avatarOliver Hartkopp <socketcan@hartkopp.net>
      Signed-off-by: default avatarMarc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      c05e6160
    • Russell King's avatar
      Change email address for 8250_pci · 6be916ce
      Russell King authored
      commit f2e0ea86 upstream.
      
      I'm still receiving reports to my email address, so let's point this
      at the linux-serial mailing list instead.
      Signed-off-by: default avatarRussell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      6be916ce
    • Michael S. Tsirkin's avatar
      virtio_console: avoid config access from irq · 02f08b63
      Michael S. Tsirkin authored
      commit eeb8a7e8 upstream.
      
      when multiport is off, virtio console invokes config access from irq
      context, config access is blocking on s390.
      Fix this up by scheduling work from config irq - similar to what we do
      for multiport configs.
      Signed-off-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      Reviewed-by: default avatarAmit Shah <amit.shah@redhat.com>
      Signed-off-by: default avatarRusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      02f08b63
    • Michael S. Tsirkin's avatar
      virtio_console: init work unconditionally · 8a4b192f
      Michael S. Tsirkin authored
      commit 4f6e24ed upstream.
      
      when multiport is off, we don't initialize config work,
      but we then cancel uninitialized control_work on freeze.
      Signed-off-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      Reviewed-by: default avatarAmit Shah <amit.shah@redhat.com>
      Signed-off-by: default avatarRusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      8a4b192f