Commits · 886b470cb14733a0286e365c77f1844c240c33a4 · nexedi / linux

28 Nov, 2012 12 commits

KVM: x86: pass host_tsc to read_l1_tsc · 886b470c

Marcelo Tosatti authored Nov 27, 2012

Allow the caller to pass host tsc value to kvm_x86_ops->read_l1_tsc().
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

886b470c

x86: vdso: pvclock gettime support · 51c19b4f

Marcelo Tosatti authored Nov 27, 2012

Improve performance of time system calls when using Linux pvclock,
by reading time info from fixmap visible copy of pvclock data.

Originally from Jeremy Fitzhardinge.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

51c19b4f

x86: kvm guest: pvclock vsyscall support · 3dc4f7cf

Marcelo Tosatti authored Nov 27, 2012

Hook into generic pvclock vsyscall code, with the aim to
allow userspace to have visibility into pvclock data.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

3dc4f7cf

x86: pvclock: generic pvclock vsyscall initialization · 71056ae2

Marcelo Tosatti authored Nov 27, 2012

Originally from Jeremy Fitzhardinge.

Introduce generic, non hypervisor specific, pvclock initialization
routines.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

71056ae2

sched: add notifier for cross-cpu migrations · 582b336e

Marcelo Tosatti authored Nov 27, 2012

Originally from Jeremy Fitzhardinge.
Acked-by: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

582b336e

x86: pvclock: add note about rdtsc barriers · 189e1173

Marcelo Tosatti authored Nov 27, 2012

As noted by Gleb, not advertising SSE2 support implies
no RDTSC barriers.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

189e1173

x86: pvclock: introduce helper to read flags · 2697902b

Marcelo Tosatti authored Nov 27, 2012

Acked-by: Glauber Costa <glommer@parallels.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

2697902b

x86: pvclock: create helper for pvclock data retrieval · dce2db0a

Marcelo Tosatti authored Nov 27, 2012

Originally from Jeremy Fitzhardinge.

So code can be reused.
Acked-by: Glauber Costa <glommer@parallels.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

dce2db0a

x86: pvclock: remove pvclock_shadow_time · 42b5637d

Marcelo Tosatti authored Nov 27, 2012

Originally from Jeremy Fitzhardinge.

We can copy the information directly from "struct pvclock_vcpu_time_info",
remove pvclock_shadow_time.
Reviewed-by: Glauber Costa <glommer@parallels.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

42b5637d

x86: pvclock: make sure rdtsc doesnt speculate out of region · b01578de

Marcelo Tosatti authored Nov 27, 2012

Originally from Jeremy Fitzhardinge.

pvclock_get_time_values, which contains the memory barriers
will be removed by next patch.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

b01578de

x86: kvmclock: allocate pvclock shared memory area · 7069ed67

Marcelo Tosatti authored Nov 27, 2012

We want to expose the pvclock shared memory areas, which
the hypervisor periodically updates, to userspace.

For a linear mapping from userspace, it is necessary that
entire page sized regions are used for array of pvclock
structures.

There is no such guarantee with per cpu areas, therefore move
to memblock_alloc based allocation.
Acked-by: Glauber Costa <glommer@parallels.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

7069ed67

KVM: x86: retain pvclock guest stopped bit in guest memory · 78c0337a

Marcelo Tosatti authored Nov 27, 2012

Otherwise its possible for an unrelated KVM_REQ_UPDATE_CLOCK (such as due to CPU
migration) to clear the bit.

Noticed by Paolo Bonzini.
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Reviewed-by: Glauber Costa <glommer@parallels.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

78c0337a

14 Nov, 2012 3 commits

KVM: remove unnecessary return value check · 807f12e5

Guo Chao authored Nov 02, 2012

No need to check return value before breaking switch.
Signed-off-by: Guo Chao <yan@linux.vnet.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

807f12e5

KVM: x86: fix return value of kvm_vm_ioctl_set_tss_addr() · 951179ce

Guo Chao authored Nov 02, 2012

Return value of this function will be that of ioctl().

#include <stdio.h>
#include <linux/kvm.h>

int main () {
	int fd;
	fd = open ("/dev/kvm", 0);
	fd = ioctl (fd, KVM_CREATE_VM, 0);
	ioctl (fd, KVM_SET_TSS_ADDR, 0xfffff000);
	perror ("");
	return 0;
}

Output is "Operation not permitted". That's not what
we want.

Return -EINVAL in this case.
Signed-off-by: Guo Chao <yan@linux.vnet.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

951179ce

KVM: do not kfree error pointer · 18595411

Guo Chao authored Nov 02, 2012

We should avoid kfree()ing error pointer in kvm_vcpu_ioctl() and
kvm_arch_vcpu_ioctl().
Signed-off-by: Guo Chao <yan@linux.vnet.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

18595411

01 Nov, 2012 2 commits

Merge branch 'for-queue' of https://github.com/agraf/linux-2.6 into queue · f026399f

Marcelo Tosatti authored Oct 31, 2012

* 'for-queue' of https://github.com/agraf/linux-2.6:
  PPC: ePAPR: Convert hcall header to uapi (round 2)
  KVM: PPC: Book3S HV: Fix thinko in try_lock_hpte()
  KVM: PPC: Book3S HV: Allow DTL to be set to address 0, length 0
  KVM: PPC: Book3S HV: Fix accounting of stolen time
  KVM: PPC: Book3S HV: Run virtual core whenever any vcpus in it can run
  KVM: PPC: Book3S HV: Fixes for late-joining threads
  KVM: PPC: Book3s HV: Don't access runnable threads list without vcore lock
  KVM: PPC: Book3S HV: Fix some races in starting secondary threads
  KVM: PPC: Book3S HV: Allow KVM guests to stop secondary threads coming online
  PPC: ePAPR: Convert header to uapi
  KVM: PPC: Move mtspr/mfspr emulation into own functions
  KVM: Documentation: Fix reentry-to-be-consistent paragraph
  KVM: PPC: 44x: fix DCR read/write

f026399f

KVM: SVM: update MAINTAINERS entry · 7de609c8

Joerg Roedel authored Oct 29, 2012

I have no access to my AMD email address anymore. Update
entry in MAINTAINERS to the new address.

Cc: Avi Kivity <avi@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

7de609c8

31 Oct, 2012 2 commits

PPC: ePAPR: Convert hcall header to uapi (round 2) · 63a19091

Alexander Graf authored Oct 31, 2012

The new uapi framework splits kernel internal and user space exported
bits of header files more cleanly. Adjust the ePAPR header accordingly.
Signed-off-by: Alexander Graf <agraf@suse.de>

63a19091

Merge commit 'origin/queue' into for-queue · 0588000e
Alexander Graf authored Oct 31, 2012
```
Conflicts:
	arch/powerpc/include/asm/Kbuild
	arch/powerpc/include/uapi/asm/Kbuild
```
0588000e

30 Oct, 2012 12 commits

KVM: PPC: Book3S HV: Fix thinko in try_lock_hpte() · 8b5869ad

Paul Mackerras authored Oct 15, 2012

This fixes an error in the inline asm in try_lock_hpte() where we
were erroneously using a register number as an immediate operand.
The bug only affects an error path, and in fact the code will still
work as long as the compiler chooses some register other than r0
for the "bits" variable.  Nevertheless it should still be fixed.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>

8b5869ad

KVM: PPC: Book3S HV: Allow DTL to be set to address 0, length 0 · 9f8c8c78

Paul Mackerras authored Oct 15, 2012

Commit 55b665b0 ("KVM: PPC: Book3S HV: Provide a way for userspace
to get/set per-vCPU areas") includes a check on the length of the
dispatch trace log (DTL) to make sure the buffer is at least one entry
long.  This is appropriate when registering a buffer, but the
interface also allows for any existing buffer to be unregistered by
specifying a zero address.  In this case the length check is not
appropriate.  This makes the check conditional on the address being
non-zero.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>

9f8c8c78

KVM: PPC: Book3S HV: Fix accounting of stolen time · c7b67670

Paul Mackerras authored Oct 15, 2012

Currently the code that accounts stolen time tends to overestimate the
stolen time, and will sometimes report more stolen time in a DTL
(dispatch trace log) entry than has elapsed since the last DTL entry.
This can cause guests to underflow the user or system time measured
for some tasks, leading to ridiculous CPU percentages and total runtimes
being reported by top and other utilities.

In addition, the current code was designed for the previous policy where
a vcore would only run when all the vcpus in it were runnable, and so
only counted stolen time on a per-vcore basis.  Now that a vcore can
run while some of the vcpus in it are doing other things in the kernel
(e.g. handling a page fault), we need to count the time when a vcpu task
is preempted while it is not running as part of a vcore as stolen also.

To do this, we bring back the BUSY_IN_HOST vcpu state and extend the
vcpu_load/put functions to count preemption time while the vcpu is
in that state.  Handling the transitions between the RUNNING and
BUSY_IN_HOST states requires checking and updating two variables
(accumulated time stolen and time last preempted), so we add a new
spinlock, vcpu->arch.tbacct_lock.  This protects both the per-vcpu
stolen/preempt-time variables, and the per-vcore variables while this
vcpu is running the vcore.

Finally, we now don't count time spent in userspace as stolen time.
The task could be executing in userspace on behalf of the vcpu, or
it could be preempted, or the vcpu could be genuinely stopped.  Since
we have no way of dividing up the time between these cases, we don't
count any of it as stolen.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>

c7b67670

KVM: PPC: Book3S HV: Run virtual core whenever any vcpus in it can run · 8455d79e

Paul Mackerras authored Oct 15, 2012

Currently the Book3S HV code implements a policy on multi-threaded
processors (i.e. POWER7) that requires all of the active vcpus in a
virtual core to be ready to run before we run the virtual core.
However, that causes problems on reset, because reset stops all vcpus
except vcpu 0, and can also reduce throughput since all four threads
in a virtual core have to wait whenever any one of them hits a
hypervisor page fault.

This relaxes the policy, allowing the virtual core to run as soon as
any vcpu in it is runnable.  With this, the KVMPPC_VCPU_STOPPED state
and the KVMPPC_VCPU_BUSY_IN_HOST state have been combined into a single
KVMPPC_VCPU_NOTREADY state, since we no longer need to distinguish
between them.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>

8455d79e

KVM: PPC: Book3S HV: Fixes for late-joining threads · 2f12f034

Paul Mackerras authored Oct 15, 2012

If a thread in a virtual core becomes runnable while other threads
in the same virtual core are already running in the guest, it is
possible for the latecomer to join the others on the core without
first pulling them all out of the guest.  Currently this only happens
rarely, when a vcpu is first started.  This fixes some bugs and
omissions in the code in this case.

First, we need to check for VPA updates for the latecomer and make
a DTL entry for it.  Secondly, if it comes along while the master
vcpu is doing a VPA update, we don't need to do anything since the
master will pick it up in kvmppc_run_core.  To handle this correctly
we introduce a new vcore state, VCORE_STARTING.  Thirdly, there is
a race because we currently clear the hardware thread's hwthread_req
before waiting to see it get to nap.  A latecomer thread could have
its hwthread_req cleared before it gets to test it, and therefore
never increment the nap_count, leading to messages about wait_for_nap
timeouts.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>

2f12f034

KVM: PPC: Book3s HV: Don't access runnable threads list without vcore lock · 913d3ff9

Paul Mackerras authored Oct 15, 2012

There were a few places where we were traversing the list of runnable
threads in a virtual core, i.e. vc->runnable_threads, without holding
the vcore spinlock.  This extends the places where we hold the vcore
spinlock to cover everywhere that we traverse that list.

Since we possibly need to sleep inside kvmppc_book3s_hv_page_fault,
this moves the call of it from kvmppc_handle_exit out to
kvmppc_vcpu_run, where we don't hold the vcore lock.

In kvmppc_vcore_blocked, we don't actually need to check whether
all vcpus are ceded and don't have any pending exceptions, since the
caller has already done that.  The caller (kvmppc_run_vcpu) wasn't
actually checking for pending exceptions, so we add that.

The change of if to while in kvmppc_run_vcpu is to make sure that we
never call kvmppc_remove_runnable() when the vcore state is RUNNING or
EXITING.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>

913d3ff9

KVM: PPC: Book3S HV: Fix some races in starting secondary threads · 7b444c67

Paul Mackerras authored Oct 15, 2012

Subsequent patches implementing in-kernel XICS emulation will make it
possible for IPIs to arrive at secondary threads at arbitrary times.
This fixes some races in how we start the secondary threads, which
if not fixed could lead to occasional crashes of the host kernel.

This makes sure that (a) we have grabbed all the secondary threads,
and verified that they are no longer in the kernel, before we start
any thread, (b) that the secondary thread loads its vcpu pointer
after clearing the IPI that woke it up (so we don't miss a wakeup),
and (c) that the secondary thread clears its vcpu pointer before
incrementing the nap count. It also removes unnecessary setting
of the vcpu and vcore pointers in the paca in kvmppc_core_vcpu_load.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>

7b444c67

KVM: PPC: Book3S HV: Allow KVM guests to stop secondary threads coming online · 512691d4

Paul Mackerras authored Oct 15, 2012

When a Book3S HV KVM guest is running, we need the host to be in
single-thread mode, that is, all of the cores (or at least all of
the cores where the KVM guest could run) to be running only one
active hardware thread.  This is because of the hardware restriction
in POWER processors that all of the hardware threads in the core
must be in the same logical partition.  Complying with this restriction
is much easier if, from the host kernel's point of view, only one
hardware thread is active.

This adds two hooks in the SMP hotplug code to allow the KVM code to
make sure that secondary threads (i.e. hardware threads other than
thread 0) cannot come online while any KVM guest exists.  The KVM
code still has to check that any core where it runs a guest has the
secondary threads offline, but having done that check it can now be
sure that they will not come online while the guest is running.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Alexander Graf <agraf@suse.de>

512691d4

PPC: ePAPR: Convert header to uapi · c99ec973

Alexander Graf authored Oct 27, 2012

The new uapi framework splits kernel internal and user space exported
bits of header files more cleanly. Adjust the ePAPR header accordingly.
Signed-off-by: Alexander Graf <agraf@suse.de>

c99ec973

KVM: PPC: Move mtspr/mfspr emulation into own functions · 388cf9ee

Alexander Graf authored Oct 06, 2012

The mtspr/mfspr emulation code became quite big over time. Move it
into its own function so things stay more readable.
Signed-off-by: Alexander Graf <agraf@suse.de>

388cf9ee

KVM: Documentation: Fix reentry-to-be-consistent paragraph · 686de182

Alexander Graf authored Oct 07, 2012

All user space offloaded instruction emulation needs to reenter kvm
to produce consistent state again. Fix the section in the documentation
to mention all of them.
Signed-off-by: Alexander Graf <agraf@suse.de>

686de182

KVM: PPC: 44x: fix DCR read/write · e43a0287

Alexander Graf authored Oct 06, 2012

When remembering the direction of a DCR transaction, we should write
to the same variable that we interpret on later when doing vcpu_run
again.
Signed-off-by: Alexander Graf <agraf@suse.de>
Cc: stable@vger.kernel.org

e43a0287

29 Oct, 2012 4 commits

KVM: do not treat noslot pfn as a error pfn · 81c52c56

Xiao Guangrong authored Oct 16, 2012

This patch filters noslot pfn out from error pfns based on Marcelo comment:
noslot pfn is not a error pfn

After this patch,
- is_noslot_pfn indicates that the gfn is not in slot
- is_error_pfn indicates that the gfn is in slot but the error is occurred
  when translate the gfn to pfn
- is_error_noslot_pfn indicates that the pfn either it is error pfns or it
  is noslot pfn
And is_invalid_pfn can be removed, it makes the code more clean
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

81c52c56

Merge remote-tracking branch 'master' into queue · 19bf7f8a

Marcelo Tosatti authored Oct 29, 2012

Merge reason: development work has dependency on kvm patches merged
upstream.

Conflicts:
	arch/powerpc/include/asm/Kbuild
	arch/powerpc/include/asm/kvm_para.h
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

19bf7f8a

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client · 35fd3dc5

Linus Torvalds authored Oct 29, 2012

Pull Ceph fixes form Sage Weil:
 "There are two fixes in the messenger code, one that can trigger a NULL
  dereference, and one that error in refcounting (extra put).  There is
  also a trivial fix that in the fs client code that is triggered by NFS
  reexport."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  ceph: fix dentry reference leak in encode_fh()
  libceph: avoid NULL kref_put when osd reset races with alloc_msg
  rbd: reset BACKOFF if unable to re-queue

35fd3dc5

ceph: fix dentry reference leak in encode_fh() · 52eb5a90

David Zafman authored Oct 18, 2012

Call to d_find_alias() needs a corresponding dput()

This fixes http://tracker.newdream.net/issues/3271Signed-off-by: David Zafman <david.zafman@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

52eb5a90

28 Oct, 2012 5 commits

Merge branch 'i2c-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging · 6b0cb4ee

Linus Torvalds authored Oct 28, 2012

Pull i2c subsystem fixes from Jean Delvare.

* 'i2c-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
  i2c-i801: Fix comment
  i2c-i801: Simplify dependency towards GPIOLIB
  i2c-stub: Move to drivers/i2c

6b0cb4ee

i2c-i801: Fix comment · 28901f57
Jean Delvare authored Oct 28, 2012
```
Signed-off-by: Jean Delvare <khali@linux-fr.org>
```
28901f57

i2c-i801: Simplify dependency towards GPIOLIB · 79e3e5b8

Jean Delvare authored Oct 28, 2012

Arbitrarily selecting GPIOLIB causes trouble on some architectures,
so don't do that. Instead, just make the optional multiplexing code
depend on CONFIG_I2C_MUX_GPIO instead of CONFIG_I2C_MUX for now. We
can revisit if the i2c-i801 driver ever supports other multiplexing
flavors.

Also make that optional code depend on DMI, as it won't do anything
without that.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Fengguang Wu <fengguang.wu@intel.com>

79e3e5b8

i2c-stub: Move to drivers/i2c · 31d178bf

Jean Delvare authored Oct 28, 2012

Move the i2c-stub driver to drivers/i2c, to match the Kconfig entry.
This is less confusing that way.

I also fixed all checkpatch warnings and errors.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Peter Huewe <peterhuewe@gmx.de>

31d178bf

Linux 3.7-rc3 · 8f0d8163
Linus Torvalds authored Oct 28, 2012

8f0d8163