1. 10 Feb, 2015 3 commits
  2. 06 Feb, 2015 5 commits
    • vfio: Tie IOMMU group reference to vfio group · 4a68810d
      Alex Williamson authored
      Move the iommu_group reference from the device to the vfio_group.
      This ensures that the iommu_group persists as long as the vfio_group
      remains.  This can be important if all of the devices from an
      iommu_group are removed, but we still have an outstanding vfio_group
      reference; we can still walk the empty list of devices.
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
    • vfio: Add device tracking during unbind · 60720a0f
      Alex Williamson authored
      There's a small window between the vfio bus driver calling
      vfio_del_group_dev() and the device being completely unbound where
      the vfio group appears to be non-viable.  This creates a race for
      users like QEMU/KVM where the kvm-vfio module tries to get an
      external reference to the group in order to match and release an
      existing reference, while the device is potentially being removed
      from the vfio bus driver.  If the group is momentarily non-viable,
      kvm-vfio may not be able to release the group reference until VM
      shutdown, making the group unusable until that point.
      
      Bridge the gap between device removal from the group and completion
      of the driver unbind by tracking it in a list.  The device is added
      to the list before the bus driver reference is released and removed
      using the existing unbind notifier.
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
    • vfio/type1: Add conditional rescheduling · c5e66887
      Alex Williamson authored
      IOMMU operations can be expensive and it's not very difficult for a
      user to give us a lot of work to do for a map or unmap operation.
      Killing a large VM with vfio assigned devices can result in soft
      lockups and IOMMU tracing shows that we can easily spend 80% of our
      time with need-resched set.  A sprinkling of cond_resched() calls
      after map and unmap calls has a very tiny effect on performance
      while resulting in traces with <1% of calls overflowing into
      needs-resched.
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
    • vfio/type1: Chunk contiguous reserved/invalid page mappings · babbf176
      Alex Williamson authored
      We currently map invalid and reserved pages, such as often occur from
      mapping MMIO regions of a VM through the IOMMU, using single pages.
      There's really no reason we can't instead follow the methodology we
      use for normal pages and find the largest possible physically
      contiguous chunk for mapping.  The only difference is that we don't
      do locked memory accounting for these since they're not backed by RAM.
      
      In most applications this will be a very minor improvement, but when
      graphics and GPGPU devices are in play, MMIO BARs become non-trivial.
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
    • vfio/type1: DMA unmap chunking · 6fe1010d
      Alex Williamson authored
      When unmapping DMA entries we try to rely on the IOMMU API behavior
      that allows the IOMMU to unmap a larger area than requested, up to
      the size of the original mapping.  This works great when the IOMMU
      supports superpages *and* they're in use.  Otherwise, each PAGE_SIZE
      increment is unmapped separately, resulting in poor performance.
      
      Instead we can use the IOVA-to-physical-address translation provided
      by the IOMMU API and unmap using the largest contiguous physical
      memory chunk available, which is also how vfio/type1 would have
      mapped the region.  For a synthetic 1TB guest VM mapping and shutdown
      test on Intel VT-d (2M IOMMU pagesize support), this achieves about
      a 30% overall improvement mapping standard 4K pages, regardless of
      IOMMU superpage enabling, and about a 40% improvement mapping 2M
      hugetlbfs pages when IOMMU superpages are not available.  Hugetlbfs
      with IOMMU superpages enabled is effectively unchanged.
      
      Unfortunately the same algorithm does not work well on IOMMUs with
      fine-grained superpages, like AMD-Vi, costing about 25% extra since
      the IOMMU will automatically unmap any power-of-two contiguous
      mapping we've provided it.  We add a routine and a domain flag to
      detect this feature, leaving AMD-Vi unaffected by this unmap
      optimization.
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
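
The chunking idea described in the commit above can be illustrated with a small userspace sketch. This is not the kernel code: the `phys` array is a hypothetical stand-in for per-page IOVA-to-physical lookups, and `contiguous_run()` and `count_unmap_calls()` are names invented here for illustration. It only shows how grouping physically contiguous pages reduces the number of unmap calls compared with one call per page.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model: phys[i] is the physical page frame number that
 * backs the i-th page of an IOVA range (what an IOVA-to-physical lookup
 * would report). Returns the length, in pages, of the physically
 * contiguous run that begins at index start. Requires start < npages.
 */
static size_t contiguous_run(const uint64_t *phys, size_t npages, size_t start)
{
    size_t len = 1;

    while (start + len < npages && phys[start + len] == phys[start] + len)
        len++;
    return len;
}

/*
 * Walk the range one contiguous chunk at a time and count how many
 * unmap calls that would take. Unmapping page by page would instead
 * take npages calls.
 */
static size_t count_unmap_calls(const uint64_t *phys, size_t npages)
{
    size_t calls = 0;
    size_t i = 0;

    while (i < npages) {
        i += contiguous_run(phys, npages, i);
        calls++;
    }
    return calls;
}
```

For a range backed by frames 100, 101, 102, 200, 201, 7, the walk issues three unmap calls (runs of 3, 2 and 1 pages) rather than six.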
  3. 02 Feb, 2015 1 commit
  4. 01 Feb, 2015 5 commits
    • Merge tag 'armsoc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · fba7e994
      Linus Torvalds authored
      Pull ARM SoC fixes from Olof Johansson:
       "One more week's worth of fixes.  Worth pointing out here are:
      
         - A patch fixing detaching of iommu registrations when a device is
           removed -- earlier the ops pointer wasn't managed properly
         - Another set of Renesas boards get the same GIC setup fixup as
           others have in previous -rcs
         - Serial port aliases fixups for sunxi.  We did the same to tegra but
           we caught that in time before the merge window due to more machines
           being affected.  Here it took longer for anyone to notice.
         - A couple more DT tweaks on sunxi
         - A follow-up patch for the mvebu coherency disabling in last -rc
           batch"
      
      * tag 'armsoc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
        arm: dma-mapping: Set DMA IOMMU ops in arm_iommu_attach_device()
        ARM: shmobile: r8a7790: Instantiate GIC from C board code in legacy builds
        ARM: shmobile: r8a73a4: Instantiate GIC from C board code in legacy builds
        ARM: mvebu: don't set the PL310 in I/O coherency mode when I/O coherency is disabled
        ARM: sunxi: dt: Fix aliases
        ARM: dts: sun4i: Add simplefb node with de_fe0-de_be0-lcd0-hdmi pipeline
        ARM: dts: sun6i: ippo-q8h-v5: Fix serial0 alias
        ARM: dts: sunxi: Fix usb-phy support for sun4i/sun5i
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input · 3441456b
      Linus Torvalds authored
      Pull input layer updates from Dmitry Torokhov:
       "Just a few quirks for PS/2 this time"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
        Input: elantech - add more Fujtisu notebooks to force crc_enabled
        Input: i8042 - add noloop quirk for Medion Akoya E7225 (MD98857)
        Input: synaptics - adjust min/max for Lenovo ThinkPad X1 Carbon 2nd
    • sched: don't cause task state changes in nested sleep debugging · 00845eb9
      Linus Torvalds authored
      Commit 8eb23b9f ("sched: Debug nested sleeps") added code to report
      on nested sleep conditions, which we generally want to avoid because the
      inner sleeping operation can re-set the thread state to TASK_RUNNING,
      but that will then cause the outer sleep loop to not actually sleep when it
      calls schedule.
      
      However, that's actually valid traditional behavior, with the inner
      sleep being some fairly rare case (like taking a sleeping lock that
      normally doesn't actually need to sleep).
      
      And the debug code would actually change the state of the task to
      TASK_RUNNING internally, which makes that kind of traditional and
      working code not work at all, because now the nested sleep doesn't just
      sometimes cause the outer one to not block, but will cause it to happen
      every time.
      
      In particular, it will cause the cardbus kernel daemon (pccardd) to
      basically busy-loop doing scheduling, converting a laptop into a heater,
      as reported by Bruno Prémont.  But there may be other legacy uses of
      that nested sleep model in other drivers that are also likely to never
      get converted to the new model.
      
      This fixes both cases:
      
       - don't set TASK_RUNNING when the nested condition happens (note: even
         if WARN_ONCE() only _warns_ once, the return value isn't whether the
         warning happened, but whether the condition for the warning was true.
         So despite the warning only happening once, the "if (WARN_ON(..))"
         would trigger for every nested sleep).
      
       - in the cases where we knowingly disable the warning by using
         "sched_annotate_sleep()", don't change the task state (that is used
         for all core scheduling decisions), instead use '->task_state_change'
         that is used for the debugging decision itself.
      
      (Credit for the second part of the fix goes to Oleg Nesterov: "Can't we
      avoid this subtle change in behaviour DEBUG_ATOMIC_SLEEP adds?" with the
      suggested change to use 'task_state_change' as part of the test)
      Reported-and-bisected-by: Bruno Prémont <bonbons@linux-vserver.org>
      Tested-by: Rafael J Wysocki <rjw@rjwysocki.net>
      Acked-by: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>,
      Cc: Ilya Dryomov <ilya.dryomov@inktank.com>,
      Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Peter Hurley <peter@hurleysoftware.com>,
      Cc: Davidlohr Bueso <dave@stgolabs.net>,
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
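
The point about the WARN_ONCE() return value in the commit above can be shown with a userspace imitation. This is a sketch, not the kernel macro: `warn_once_like()`, `count_branch_hits()` and `warnings_printed` are hypothetical names made up for this example. It mimics only the semantics the message describes, where the warning is printed at most once but the returned value is the condition itself.

```c
#include <stdbool.h>
#include <stdio.h>

/* Counts how many times the warning text was actually printed. */
static int warnings_printed;

/*
 * Imitation of the "warn once" semantics described above: the message
 * is printed only the first time the condition is true, but the return
 * value is always the condition, not "did we print".
 */
static bool warn_once_like(bool condition, bool *already_warned, const char *msg)
{
    if (condition && !*already_warned) {
        *already_warned = true;
        warnings_printed++;
        fprintf(stderr, "WARNING: %s\n", msg);
    }
    return condition;
}

/*
 * Simulate n nested sleeps guarded by "if (warn_once_like(...))" and
 * count how often the guarded branch runs.
 */
static int count_branch_hits(int n)
{
    bool warned = false;
    int hits = 0;

    for (int i = 0; i < n; i++) {
        if (warn_once_like(true, &warned, "nested sleep"))
            hits++;
    }
    return hits;
}
```

With five simulated nested sleeps the branch runs five times while the warning is printed once. That is why any state change placed inside such a branch takes effect on every nested sleep.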
    • Input: elantech - add more Fujtisu notebooks to force crc_enabled · 47c1ffb2
      Rainer Koenig authored
      Add two more Fujitsu LIFEBOOK models that also ship with the Elantech
      touchpad and don't work with crc_disabled to the quirk list.
      Signed-off-by: Rainer Koenig <Rainer.Koenig@ts.fujitsu.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
    • Merge tag 'renesas-soc-fixes3-for-v3.19' of... · 28111dda
      Olof Johansson authored
      Merge tag 'renesas-soc-fixes3-for-v3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into fixes
      
      Merge "Third Round of Renesas ARM Based SoC Fixes for v3.19" from Simon Horman:
      
      * Instantiate GIC from C board code in legacy builds on r8a7790 and r8a73a4
      
      * tag 'renesas-soc-fixes3-for-v3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
        ARM: shmobile: r8a7790: Instantiate GIC from C board code in legacy builds
        ARM: shmobile: r8a73a4: Instantiate GIC from C board code in legacy builds
      Signed-off-by: Olof Johansson <olof@lixom.net>
  5. 31 Jan, 2015 4 commits
  6. 30 Jan, 2015 12 commits
  7. 29 Jan, 2015 10 commits