1. 18 Jul, 2010 10 commits
    • plist: Add plist_last · 12e4d0cc
      James Bottomley authored
      plist is currently used by the scheduler, which only needs to know the
      highest item in the list.  This adds plist_last which allows you to
      find the lowest.  This is necessary for using plists to implement a
      fast search of dynamic ranges in pm_qos which can have both highest
      and lowest criteria.
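A minimal sketch of the idea, not the kernel code: a priority-sorted list that exposes both ends. It assumes entries are kept in ascending order of a numeric priority value, so `first()` returns the entry with the smallest value and `last()` the entry with the largest. The class `PriorityList` and its method names are hypothetical stand-ins for the kernel's plist helpers.

```python
import bisect


class PriorityList:
    """Toy model of a list kept sorted by ascending priority value."""

    def __init__(self):
        self._items = []  # (prio, seq, payload) tuples, always sorted
        self._seq = 0     # insertion counter: equal priorities stay in FIFO order

    def add(self, prio, payload):
        # seq is unique, so tuple comparison never reaches the payload.
        bisect.insort(self._items, (prio, self._seq, payload))
        self._seq += 1

    def first(self):
        """Entry with the smallest priority value (raises IndexError if empty)."""
        return self._items[0][2]

    def last(self):
        """Entry with the largest priority value (raises IndexError if empty)."""
        return self._items[-1][2]


requests = PriorityList()
for value in (100, 20, 55):
    requests.add(value, f"req-{value}")

print(requests.first(), requests.last())
```

Because the list is already sorted, both lookups are constant-time, which is what makes a single structure usable for constraints that need either the minimum or the maximum.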
      Signed-off-by: James Bottomley <James.Bottomley@suse.de>
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
    • PM: Make it possible to avoid races between wakeup and system sleep · c125e96f
      Rafael J. Wysocki authored
      One of the arguments during the suspend blockers discussion was that
      the mainline kernel didn't contain any mechanisms making it possible
      to avoid races between wakeup and system suspend.
      
      Generally, there are two problems in that area.  First, if a wakeup
      event occurs exactly when /sys/power/state is being written to, it
      may be delivered to user space right before the freezer kicks in, so
      the user space consumer of the event may not be able to process it
      before the system is suspended.  Second, if a wakeup event occurs
      after user space has been frozen, it is not generally guaranteed that
      the ongoing transition of the system into a sleep state will be
      aborted.
      
      To address these issues introduce a new global sysfs attribute,
      /sys/power/wakeup_count, associated with a running counter of wakeup
      events and three helper functions, pm_stay_awake(), pm_relax(), and
      pm_wakeup_event(), that may be used by kernel subsystems to control
      the behavior of this attribute and to request the PM core to abort
      system transitions into a sleep state already in progress.
      
      The /sys/power/wakeup_count file may be read from or written to by
      user space.  Reads will always succeed (unless interrupted by a
      signal) and return the current value of the wakeup events counter.
      Writes, however, will only succeed if the written number is equal to
      the current value of the wakeup events counter.  If a write is
      successful, it will cause the kernel to save the current value of the
      wakeup events counter and to abort the subsequent system transition
      into a sleep state if any wakeup events are reported after the write
      has returned.
      
      [The assumption is that before writing to /sys/power/state user space
      will first read from /sys/power/wakeup_count.  Next, user space
      consumers of wakeup events will have a chance to acknowledge or
      veto the upcoming system transition to a sleep state.  Finally, if
      the transition is allowed to proceed, /sys/power/wakeup_count will
      be written to and if that succeeds, /sys/power/state will be written
      to as well.  Still, if any wakeup events are reported to the PM core
      by kernel subsystems after that point, the transition will be
      aborted.]
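The read, check, write, suspend sequence described above can be sketched with an in-memory model. This is an illustration under stated assumptions, not the kernel interface: `WakeupCounterModel` and `try_suspend` are hypothetical names, `report_event()` stands in for a subsystem calling pm_wakeup_event(), and `suspend()` stands in for writing to /sys/power/state.

```python
class WakeupCounterModel:
    """Hypothetical in-memory model of the wakeup_count semantics."""

    def __init__(self):
        self.count = 0     # running counter of wakeup events
        self.saved = None  # value stored by the last successful write

    def report_event(self):
        # Models a kernel subsystem reporting a wakeup event.
        self.count += 1

    def read(self):
        # Reads always succeed and return the current counter.
        return self.count

    def write(self, value):
        # A write succeeds only if it equals the current counter value.
        if value != self.count:
            return False
        self.saved = self.count
        return True

    def suspend(self):
        # Abort if any event was reported after the successful write.
        saved, self.saved = self.saved, None
        return saved is None or saved == self.count


def try_suspend(counter, consumers_agree, after_write=None):
    """Model of the user-space sequence; after_write is a test hook only."""
    seen = counter.read()
    if not consumers_agree():
        return "vetoed"
    if not counter.write(seen):
        return "retry"  # an event arrived between the read and the write
    if after_write is not None:
        after_write()
    return "suspended" if counter.suspend() else "aborted"


counter = WakeupCounterModel()
print(try_suspend(counter, lambda: True))
print(try_suspend(counter, lambda: True, after_write=counter.report_event))
```

The second call shows the late-event case: the write succeeds, an event is reported afterwards, and the modelled transition is aborted instead of completing.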
      
      Additionally, put a wakeup events counter into struct dev_pm_info and
      make these per-device wakeup event counters available via sysfs,
      so that it's possible to check the activity of various wakeup event
      sources within the kernel.
      
      To illustrate how subsystems can use pm_wakeup_event(), make the
      low-level PCI runtime PM wakeup-handling code use it.
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
      Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
      Acked-by: markgross <markgross@thegnar.org>
      Reviewed-by: Alan Stern <stern@rowland.harvard.edu>
    • PNPACPI: Add support for remote wakeup · b14e033e
      Alan Stern authored
      This patch (as1354) adds remote-wakeup support to the pnpacpi driver.
      The new can_wakeup method also allows other PNP protocol drivers
      (pnpbios or isapnp) to add wakeup support, but I don't know enough
      about how they work to actually do it.
      Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
      Reviewed-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
    • PM: describe kernel policy regarding wakeup defaults (v. 2) · 2430d12c
      Alan Stern authored
      This patch (as1381b) updates a comment describing the kernel's policy
      toward enabling wakeup by default.
      
      It also makes device_set_wakeup_capable() actually do something when
      CONFIG_PM isn't enabled.  It's not clear this is necessary; however if
      it isn't then device_init_wakeup() and device_can_wakeup() should also
      be do-nothing routines.  Furthermore, I don't expect this change to
      have any noticeable effect -- but if it does then clearly the old
      behavior was wrong.
      Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
    • PM / Hibernate: Fix typos in comments in kernel/power/swap.c · 90133673
      Cesar Eduardo Barros authored
      There are a few typos in kernel/power/swap.c.  Fix them.
      Signed-off-by: Cesar Eduardo Barros <cesarb@cesarb.net>
      Acked-by: Pavel Machek <pavel@ucw.cz>
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
    • Linus Torvalds's avatar
      a9f7f2e7
    • x86: kprobes: fix swapped segment registers in kretprobe · a1974798
      Roland McGrath authored
      In commit f007ea26, the order of the %es and %ds segment registers
      got accidentally swapped, so synthesized 'struct pt_regs' frames
      have the two values inverted.  It's almost sure that these values
      never matter, and that they also never differ.  But wrong is wrong.
      Signed-off-by: Roland McGrath <roland@redhat.com>
    • Linus Torvalds's avatar
      2044f228
    • Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2 · bea9a6d2
      Linus Torvalds authored
      * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2:
        ocfs2: Silence gcc warning in ocfs2_write_zero_page().
        jbd2/ocfs2: Fix block checksumming when a buffer is used in several transactions
        ocfs2/dlm: Remove BUG_ON from migration in the rare case of a down node
        ocfs2: Don't duplicate pages past i_size during CoW.
        ocfs2: tighten up strlen() checking
        ocfs2: Make xattr reflink work with new local alloc reservation.
        ocfs2: make xattr extension work with new local alloc reservation.
        ocfs2: Remove the redundant cpu_to_le64.
        ocfs2/dlm: don't access beyond bitmap size
        ocfs2: No need to zero pages past i_size.
        ocfs2: Zero the tail cluster when extending past i_size.
        ocfs2: When zero extending, do it by page.
        ocfs2: Limit default local alloc size within bitmap range.
        ocfs2: Move orphan scan work to ocfs2_wq.
        fs/ocfs2/dlm: Add missing spin_unlock
    • drm/i915: add 'reclaimable' to i915 self-reclaimable page allocations · cd9f040d
      Linus Torvalds authored
      The hibernate issues that got fixed in commit 985b823b ("drm/i915:
      fix hibernation since i915 self-reclaim fixes") turn out to have been
      incomplete.  Vefa Bicakci tested lots of hibernate cycles, and without
      the __GFP_RECLAIMABLE flag the system eventually fails to resume.
      
      With the flag added, Vefa can apparently hibernate forever (or until he
      gets bored running his automated scripts, whichever comes first).
      
      The reclaimable flag was there originally, and was one of the flags that
      were dropped (unintentionally) by commit 4bdadb97 ("drm/i915:
      Selectively enable self-reclaim") that introduced all these problems,
      but I didn't want to just blindly add back all the flags in commit
      985b823b, and it looked like __GFP_RECLAIMABLE wasn't necessary.  It
      clearly was.
      
      I still suspect that there is some subtle reason we're missing that
      causes the problems, but __GFP_RECLAIMABLE is certainly not wrong to use
      in this context, and is what the code historically used.  And we have no
      idea what causes the corruption without it.
      Reported-and-tested-by: M. Vefa Bicakci <bicave@superonline.com>
      Cc: Dave Airlie <airlied@gmail.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
      Cc: stable@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 16 Jul, 2010 7 commits
  3. 15 Jul, 2010 11 commits
    • jbd2/ocfs2: Fix block checksumming when a buffer is used in several transactions · 13ceef09
      Jan Kara authored
      OCFS2 uses t_commit trigger to compute and store checksum of the just
      committed blocks. When a buffer has b_frozen_data, checksum is computed
      for it instead of b_data but this can result in an old checksum being
      written to the filesystem in the following scenario:
      
      1) transaction1 is opened
      2) handle1 is opened
      3) journal_access(handle1, bh)
          - This sets jh->b_transaction to transaction1
      4) modify(bh)
      5) journal_dirty(handle1, bh)
      6) handle1 is closed
      7) start committing transaction1, opening transaction2
      8) handle2 is opened
      9) journal_access(handle2, bh)
          - This copies off b_frozen_data to make it safe for transaction1 to commit.
            jh->b_next_transaction is set to transaction2.
      10) jbd2_journal_write_metadata() checksums b_frozen_data
      11) the journal correctly writes b_frozen_data to the disk journal
      12) handle2 is closed
          - There was no dirty call for the bh on handle2, so it is never queued for
            any more journal operation
      13) Checkpointing finally happens, and it just spools the bh via normal buffer
      writeback.  This will write b_data, which was never triggered on and thus
      contains a wrong (old) checksum.
      
      This patch fixes the problem by calling the trigger at the moment data is
      frozen for journal commit - i.e., either when b_frozen_data is created by
      do_get_write_access or just before we write a buffer to the log if
      b_frozen_data does not exist. We also rename the trigger to t_frozen as
      that better describes when it is called.
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
    • ocfs2/dlm: Remove BUG_ON from migration in the rare case of a down node · a39953dd
      Wengang Wang authored
      For migration, we are waiting for DLM_LOCK_RES_MIGRATING flag to be set
      before sending DLM_MIG_LOCKRES_MSG message to the target. We are using
      dlm_migration_can_proceed() for that purpose.  However, if the node is
      down, dlm_migration_can_proceed() will also return "go ahead".  In this
      rare case, the DLM_LOCK_RES_MIGRATING flag might not be set yet. Remove
      the BUG_ON() that trips over this condition.
      Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
    • ocfs2: Don't duplicate pages past i_size during CoW. · f5e27b6d
      Tao Ma authored
      During CoW, the pages after i_size don't contain valid data, so there's
      no need to read and duplicate them.
      Signed-off-by: Tao Ma <tao.ma@oracle.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
    • GFS2: rename causes kernel Oops · 728a756b
      Bob Peterson authored
      This patch fixes a kernel Oops in the GFS2 rename code.
      
      The problem was in the way the gfs2 directory code was trying
      to re-use sentinel directory entries.
      
      In the failing case, gfs2's rename function was renaming a
      file to another name that had the same non-trivial length.
      The file being renamed happened to be the first directory
      entry on the leaf block.
      
      First, the rename code (gfs2_rename in ops_inode.c) found the
      original directory entry and decided it could do its job by
      simply replacing the directory entry with another.  Therefore
      it determined correctly that no block allocations were needed.
      
      Next, the rename code deleted the old directory entry prior to
      replacing it with the new name.  Therefore, the soon-to-be
      replaced directory entry was temporarily made into a directory
      entry "sentinel" or a place holder at the start of a leaf block.
      
      Lastly, it went to re-add the replacement directory entry in
      that leaf block.  However, when gfs2_dirent_find_space was
      looking for space in the leaf block, it used the wrong value
      for the sentinel.  That threw off its calculations, so it later
      decided it could not re-use the sentinel after all and therefore
      had to allocate a new leaf block.  But because it had previously
      decided to re-use the directory entry, it had not taken a new
      block allocation for the inode.  Therefore, the inode's i_alloc
      pointer was still NULL and the code crashed trying to
      reference it.
      
      In the case of sentinel directory entries, the entire dirent is
      reused, not just the "free space" portion of it, and therefore
      the function gfs2_dirent_find_space should use the value 0
      rather than GFS2_DIRENT_SIZE(0) for the actual dirent size.
      
      Fixing this calculation enables the reproducer programs to work
      properly.
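The size check described above can be sketched as follows. This is a simplified model, not the kernel function: it assumes GFS2_DIRENT_SIZE(n) is the dirent header plus the name length rounded up to a multiple of 8, and it assumes a 40-byte header. The names `dirent_size` and `fits`, and the `fixed` switch, are hypothetical and exist only to compare the old and new calculations.

```python
DIRENT_HEADER = 40  # assumed size of the on-disk dirent header, in bytes


def dirent_size(name_len):
    """Model of GFS2_DIRENT_SIZE: header plus name, rounded up to 8 bytes."""
    return (DIRENT_HEADER + name_len + 7) & ~7


def fits(rec_len, cur_name_len, new_name_len, is_sentinel, fixed=True):
    """Return True if a new name fits in the space held by an existing dirent.

    rec_len is the total record length of the existing dirent.  For a
    sentinel the whole record is reusable, so the fixed calculation
    treats the space currently in use as 0.
    """
    required = dirent_size(new_name_len)
    if is_sentinel:
        actual = 0 if fixed else dirent_size(0)  # old code subtracted a header
    else:
        actual = dirent_size(cur_name_len)
    return rec_len - actual >= required


# Rename to a name of the same length: the record is exactly big enough.
rec_len = dirent_size(8)
print(fits(rec_len, 0, 8, is_sentinel=True, fixed=False))
print(fits(rec_len, 0, 8, is_sentinel=True, fixed=True))
```

With the old calculation the sentinel appears too small for a same-length name, which is the mismatch that led the rename path to ask for a new leaf block it had not reserved.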
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • GFS2: BUG in gfs2_adjust_quota · 8b421601
      Abhijith Das authored
      HighMem pages on i686 do not get mapped to the buffer_heads and this was
      causing a NULL pointer dereference when we were trying to memset page buffers
      to zero.
      We now use zero_user() that kmaps the page and directly manipulates page data.
      This patch also fixes a boundary condition that was incorrect.
      Signed-off-by: Abhi Das <adas@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • GFS2: Fix kernel NULL pointer dereference by dlm_astd · b1becbde
      Bob Peterson authored
      This patch fixes a problem in an error path when looking
      up dinodes.  There are two sister-functions, gfs2_inode_lookup
      and gfs2_process_unlinked_inode.  Both functions acquire and
      hold the i_iopen glock for the dinode being looked up. The last
      thing they try to do is hold the i_gl glock for the dinode.
      If that glock fails for some reason, the error path was
      incorrectly calling gfs2_glock_put for the i_iopen glock twice.
      This resulted in the glock being prematurely freed.  The
      "minimum hold time" usually kept the glock in memory, but the
      lock interface to dlm (aka lock_dlm) freed its memory for the
      glock.  In some circumstances, it would cause dlm's dlm_astd daemon
      to try to call the bast function for the freed lock_dlm memory,
      which resulted in a NULL pointer dereference.
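The effect of a doubled put in an error path can be shown with a small reference-counting model. This is only an illustration of the failure mode, not GFS2 code: `RefCounted`, `lookup`, and the `buggy` switch are hypothetical, and the model frees an object as soon as its count reaches zero.

```python
class RefCounted:
    """Toy reference-counted object standing in for a glock."""

    def __init__(self, name):
        self.name = name
        self.refs = 1       # reference held by the owner of the object
        self.freed = False

    def get(self):
        if self.freed:
            raise RuntimeError(f"{self.name}: get after free")
        self.refs += 1
        return self

    def put(self):
        if self.freed:
            raise RuntimeError(f"{self.name}: put after free")
        self.refs -= 1
        if self.refs == 0:
            self.freed = True


def lookup(iopen, second_lock_ok, buggy):
    """Take a reference, then fail or succeed on a second lock."""
    iopen.get()              # reference taken for this lookup
    if second_lock_ok:
        return True          # success: the lookup keeps its reference
    iopen.put()              # error path: drop the lookup's reference
    if buggy:
        iopen.put()          # the extra put drops the owner's reference too
    return False


good = RefCounted("iopen")
lookup(good, second_lock_ok=False, buggy=False)
print(good.refs, good.freed)

bad = RefCounted("iopen")
lookup(bad, second_lock_ok=False, buggy=True)
print(bad.refs, bad.freed)
```

In the buggy variant the object is freed while its owner still believes it holds a reference, so any later callback into that object touches freed state.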
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • GFS2: recovery stuck on transaction lock · b7dc2df5
      Bob Peterson authored
      This patch fixes bugzilla bug #590878: GFS2: recovery stuck on
      transaction lock.  We set the frozen flag on the glock when we receive
      a completion that cannot be delivered due to blocked locks. At that
      point we check to see whether the first waiting holder has the noexp
      flag set. If the noexp lock is queued later, then we need to unfreeze
      the glock at that point in time, namely, in the glock work function.
      
      This patch was originally written by Steve Whitehouse, but since
      he's on holiday, I'm submitting it.  It's been well tested with a
      complex recovery test called revolver.
      Signed-off-by: Steve Whitehouse <swhiteho@redhat.com>
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
    • GFS2: O_TRUNC not working on stuffed files across cluster · a8bf2bc2
      Bob Peterson authored
      This patch replaces a statement that got dropped out by accident.
      Without the patch, truncates on stuffed (very small) files cause
      those files to have an unpredictable size.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • Merge master.kernel.org:/home/rmk/linux-2.6-arm · 2f7989ef
      Linus Torvalds authored
      * master.kernel.org:/home/rmk/linux-2.6-arm:
        ARM: 6226/1: fix kprobe bug in ldr instruction emulation
        ARM: Update mach-types
        ARM: lockdep: fix unannotated irqs-on
        ARM: 6184/2: ux500: use neutral PRCMU base
        ARM: 6212/1: atomic ops: add memory constraints to inline asm
        ARM: 6211/1: atomic ops: fix register constraints for atomic64_add_unless
        ARM: 6210/1: Do not rely on reset defaults of L2X0_AUX_CTRL
    • Linus Torvalds's avatar
    • Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc · ea4c1a7e
      Linus Torvalds authored
      * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
        powerpc/fsl-booke: Fix address issue when using relocatable kernels
        powerpc/cpm1: Mark micropatch code/data static and __init
        powerpc/cpm1: Fix build with various CONFIG_*_UCODE_PATCH combinations
        powerpc/cpm: Reintroduce global spi_pram struct (fixes build issue)
  4. 14 Jul, 2010 5 commits
  5. 12 Jul, 2010 7 commits