02 May, 2007 (9 commits)
    • holepunch: fix mmap_sem i_mutex deadlock · 64f586d8
      Hugh Dickins authored
      sys_madvise has down_write of mmap_sem, then madvise_remove calls
      vmtruncate_range which takes i_mutex and i_alloc_sem: no, we can
      easily devise deadlocks from that ordering.
      
      Fix it by having madvise_remove drop mmap_sem while calling vmtruncate_range: luckily,
      since madvise_remove doesn't split or merge vmas, it's easy to handle
      this case with a NULL prev, without restructuring sys_madvise.  (Though
      sad to retake mmap_sem when it's unlikely to be needed, and certainly
      down_read is sufficient for MADV_REMOVE, unlike the other madvices.)
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
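
      A minimal sketch of the resulting shape of madvise_remove (names
      follow mm/madvise.c of this era, but the body condenses the change
      rather than quoting the diff, and error handling is omitted):

          static long madvise_remove(struct vm_area_struct *vma,
                                     struct vm_area_struct **prev,
                                     unsigned long start, unsigned long end)
          {
                  struct address_space *mapping = vma->vm_file->f_mapping;
                  loff_t offset, endoff;
                  int error;

                  *prev = NULL;   /* tell sys_madvise we dropped mmap_sem */

                  offset = (loff_t)(start - vma->vm_start)
                                  + ((loff_t)vma->vm_pgoff << PAGE_SHIFT);
                  endoff = (loff_t)(end - vma->vm_start - 1)
                                  + ((loff_t)vma->vm_pgoff << PAGE_SHIFT);

                  /* vmtruncate_range takes i_mutex and i_alloc_sem, so
                   * release mmap_sem around it and retake it afterwards */
                  up_write(&current->mm->mmap_sem);
                  error = vmtruncate_range(mapping->host, offset, endoff);
                  down_write(&current->mm->mmap_sem);
                  return error;
          }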
    • holepunch: fix disconnected pages after second truncate · 42988ea6
      Hugh Dickins authored
      shmem_truncate_range has its own truncate_inode_pages_range, to free any
      pages racily instantiated while it was in progress: a SHMEM_PAGEIN flag
      is set when this might have happened.  But holepunching gets no chance
      to clear that flag at the start of vmtruncate_range, so it's always set
      (unless a truncate came just before), so holepunch almost always does
      this second truncate_inode_pages_range.
      
      shmem holepunch has unlikely swap<->file races hereabouts whatever we do
      (without a fuller rework than is fit for this release): I was going to
      skip the second truncate in the punch_hole case, but Miklos points out
      that would make holepunch correctness more vulnerable to swapoff.  So
      keep the second truncate, but follow it by an unmap_mapping_range to
      eliminate the disconnected pages (freed from pagecache while still
      mapped in userspace) that it might have left behind.
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
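
      In outline, the tail of shmem_truncate_range then behaves like the
      sketch below (not the verbatim diff; needs_second_truncate stands
      in for the SHMEM_PAGEIN test, punch_hole for the holepunch case):

          if (needs_second_truncate) {
                  /* free pages racily instantiated while we ran */
                  truncate_inode_pages_range(inode->i_mapping, start, end);
                  /* holepunch: also zap any user ptes still mapping
                   * pages that second truncate just freed, so nothing
                   * stays mapped while disconnected from pagecache */
                  if (punch_hole)
                          unmap_mapping_range(inode->i_mapping, start,
                                              end - start, 1);
          }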
    • holepunch: fix shmem_truncate_range punch locking · 32576fd4
      Hugh Dickins authored
      Miklos Szeredi observes that during truncation of shmem page directories,
      info->lock is released to improve latency (after lowering i_size and
      next_index to exclude races); but this is quite wrong for holepunching,
      which receives no such protection from i_size or next_index, and is left
      vulnerable to races with shmem_unuse, shmem_getpage and shmem_writepage.
      
      Hold info->lock throughout when holepunching?  No, any user could prevent
      rescheduling for far too long.  Instead take info->lock just when needed:
      in shmem_free_swp when removing the swap entries, and whenever removing
      a directory page from the level above.  But so long as we remove before
      scanning, we can safely skip taking the lock at the lower levels, except
      at misaligned start and end of the hole.
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
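
      The take-the-lock-just-when-needed idea looks roughly like this in
      shmem_free_swp (a sketch close to, but not quoting, the fix;
      truncation passes punch_lock as NULL, holepunch as &info->lock):

          static int shmem_free_swp(swp_entry_t *dir, swp_entry_t *edir,
                                    spinlock_t *punch_lock)
          {
                  spinlock_t *punch_unlock = NULL;
                  swp_entry_t *ptr;
                  int freed = 0;

                  for (ptr = dir; ptr < edir; ptr++) {
                          if (!ptr->val)
                                  continue;
                          if (unlikely(punch_lock)) {
                                  /* first live entry: take info->lock and
                                   * hold it for the rest of this page */
                                  punch_unlock = punch_lock;
                                  punch_lock = NULL;
                                  spin_lock(punch_unlock);
                                  if (!ptr->val)
                                          continue;   /* lost a race */
                          }
                          free_swap_and_cache(*ptr);
                          ptr->val = 0;
                          freed++;
                  }
                  if (punch_unlock)
                          spin_unlock(punch_unlock);
                  return freed;
          }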
    • holepunch: fix shmem_truncate_range punching too far · ac66c863
      Hugh Dickins authored
      Miklos Szeredi observes BUG_ON(!entry) in shmem_writepage() triggered
      in rare circumstances, because shmem_truncate_range() erroneously
      removes partially truncated directory pages at the end of the range:
      later reclaim on pages pointing to these removed directories triggers
      the BUG.  Indeed, and it can also cause data loss beyond the hole.
      
      Fix this as in the patch proposed by Miklos, but distinguish between
      "limit" (how far we need to search: ignore truncation's next_index
      optimization in the holepunch case - if there are races it's more
      consistent to act on the whole range specified) and "upper_limit"
      (how far we can free directory pages: generally we must be careful
      to keep partially punched pages, but can relax at end of file -
      i_size being held stable by i_mutex).
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
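
      The two limits, in sketch form (i_size_index is a hypothetical
      shorthand for the page index of end of file; the real code derives
      it from i_size, which i_mutex holds stable):

          if (!punch_hole) {
                  limit = info->next_index;       /* how far to search */
                  upper_limit = SHMEM_MAX_INDEX;  /* may free directory
                                                     pages all the way */
          } else {
                  /* ignore the next_index optimization: act on the whole
                   * specified range, for consistency under races */
                  limit = end;
                  /* keep partially punched directory pages, unless the
                   * hole runs all the way to end of file */
                  upper_limit = end >= i_size_index ? SHMEM_MAX_INDEX : end;
          }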
    • KVM: MMU: Fix host memory corruption on i386 with >= 4GB ram · 036bb853
      Avi Kivity authored
      PAGE_MASK is an unsigned long, so using it to mask physical addresses on
      i386 (which are 64-bit wide) leads to truncation.  This can result in
      page->private of unrelated memory pages being modified, with disastrous
      results.
      
      Fix by not using PAGE_MASK for physical addresses; instead calculate
      the correct value directly from PAGE_SIZE.  Also fix a similar BUG_ON().
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Avi Kivity <avi@qumranet.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
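
      The truncation is easy to reproduce in a standalone program (an
      illustration, not KVM code: uint32_t models i386's 32-bit unsigned
      long, the type in which PAGE_MASK is defined):

          #include <stdint.h>
          #include <stdio.h>

          #define PAGE_SIZE 4096u

          int main(void)
          {
                  /* i386 PAGE_MASK: ~(PAGE_SIZE - 1) as a 32-bit long */
                  uint32_t page_mask = ~(uint32_t)(PAGE_SIZE - 1);
                  uint64_t phys = 0x123456789000ULL;  /* above 4GB */

                  /* the 32-bit mask zero-extends: high bits are lost */
                  uint64_t wrong = phys & page_mask;
                  /* compute the mask from PAGE_SIZE at full width */
                  uint64_t right = phys & ~((uint64_t)PAGE_SIZE - 1);

                  printf("wrong: %#llx\n", (unsigned long long)wrong);
                  printf("right: %#llx\n", (unsigned long long)right);
                  return 0;
          }

      The masked value comes out as 0x56789000: a page above 4GB is
      silently aliased to one below it, which is how page->private of
      unrelated pages came to be modified.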
    • KVM: MMU: Fix guest writes to nonpae pde · 1c4b6343
      Avi Kivity authored
      KVM shadow page tables are always in pae mode, regardless of the guest
      setting.  This means that a guest pde (mapping 4MB of memory) is mapped
      to two shadow pdes (mapping 2MB each).
      
      When the guest writes to a pte or pde, we intercept the write and emulate it.
      We also remove any shadowed mappings corresponding to the write.  Since the
      mmu did not account for the doubling in the number of pdes, it removed the
      wrong entry, resulting in a mismatch between shadow page tables and guest
      page tables, followed shortly by guest memory corruption.
      
      This patch fixes the problem by detecting the special case of writing to
      a non-pae pde and adjusting the address and number of shadow pdes zapped
      accordingly.
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Avi Kivity <avi@qumranet.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
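
      In sketch form, the special case in the write-intercept path
      (identifier names here are illustrative, not the exact kvm mmu
      code):

          unsigned page_offset = gpa & (PAGE_SIZE - 1);
          int npte = 1;

          if (!guest_is_pae && writing_a_pde) {
                  page_offset &= ~3;   /* align to the 4-byte guest pde */
                  page_offset <<= 1;   /* rescale: shadow entries are 8
                                          bytes, twice the guest's 4 */
                  npte = 2;            /* one guest 4MB pde maps to two
                                          2MB shadow pdes: zap both */
          }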
    • HID: zeroing of bytes in output fields is bogus · 68e26a3d
      Jiri Kosina authored
      This patch removes bogus zeroing of unused bits in output reports,
      introduced in Simon's patch in commit d4ae650a.
      According to the specification, any sane device should not care
      about values of unused bits.
      
      What is worse, the zeroing is done in a way which is broken and
      might clear bits in output reports which are actually _used_.  A
      device with multiple one-bit fields shows why this is bogus: the
      second call of hid_output_report() would clear the first bit of
      the report, which had already been set up by the previous call.
      
      This patch will break LEDs on SpaceNavigator, because this device
      is broken and takes into account the bits which it shouldn't touch.
      The quirk for this particular device will be provided in a separate
      patch.
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
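
      A standalone illustration of the difference (not the hid-core
      code): writing one field may touch only that field's own bits,
      never reset the rest of the byte:

          #include <stdint.h>

          /* broken: zeroing first wipes any 1-bit field that shares the
           * byte and was set by an earlier hid_output_report() pass */
          static void put_bit_broken(uint8_t *report, unsigned bit, int v)
          {
                  report[bit >> 3] = 0;                 /* bogus zeroing */
                  report[bit >> 3] |= (uint8_t)((v & 1) << (bit & 7));
          }

          /* correct: clear and set only the field's own bit */
          static void put_bit(uint8_t *report, unsigned bit, int v)
          {
                  report[bit >> 3] &= (uint8_t)~(1u << (bit & 7));
                  report[bit >> 3] |= (uint8_t)((v & 1) << (bit & 7));
          }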
    • IB/mthca: Fix data corruption after FMR unmap on Sinai · d44da5e9
      Michael S. Tsirkin authored
      In mthca_arbel_fmr_unmap(), the high bits of the key are masked off.
      This gets rid of the effect of adjust_key(), which makes sure that
      bits 3 and 23 of the key are equal when the Sinai throughput
      optimization is enabled, and so it may happen that an FMR will end up
      with bits 3 and 23 in the key being different.  This causes data
      corruption, because when enabling the throughput optimization, the
      driver promises the HCA firmware that bits 3 and 23 of all memory keys
      will always be equal.
      
      Fix by re-applying adjust_key() after masking the key.
      
      Thanks to Or Gerlitz for reproducing the problem, and to Ariel
      Shahar for help in debugging.
      Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
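
      The shape of the fix, as a sketch (the constants model a 24-bit
      key index with the bit 3/23 invariant; the real adjust_key also
      takes the device and checks the Sinai optimization flag):

          /* Sinai optimization: bits 3 and 23 of every memory key must
           * be equal, so copy bit 3 up into bit 23 */
          static inline u32 adjust_key(u32 key)
          {
                  return (key & 0x7fffff) | ((key << 20) & 0x800000);
          }

          /* in mthca_arbel_fmr_unmap(), after the masking: */
          key &= dev->limits.num_mpts - 1;   /* may clear bit 23 */
          key = adjust_key(key);             /* restore the invariant */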
    • knfsd: Use a spinlock to protect sk_info_authunix · bd862252
      NeilBrown authored
      sk_info_authunix is not being protected properly so the object that
      it points to can be cache_put twice, leading to corruption.
      
      We borrow svsk->sk_defer_lock to provide the protection.  We should probably
      rename that lock to have a more generic name - later.
      
      Thanks to Gabriel for reporting this.
      
      Cc: Greg Banks <gnb@melbourne.sgi.com>
      Cc: Gabriel Barazer <gabriel@oxeva.fr>
      Signed-off-by: Neil Brown <neilb@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
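
      The pattern, roughly (a sketch of borrowing sk_defer_lock,
      simplified from the svcauth_unix cache helpers):

          /* install ipm as the cached entry only if the slot is empty;
           * otherwise keep ipm and drop our reference outside the lock.
           * The lock ensures each old pointer is cache_put exactly once. */
          spin_lock(&svsk->sk_defer_lock);
          if (!svsk->sk_info_authunix) {
                  svsk->sk_info_authunix = ipm;   /* cache takes the ref */
                  ipm = NULL;
          }
          spin_unlock(&svsk->sk_defer_lock);
          if (ipm)
                  cache_put(&ipm->h, &ip_map_cache);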