1. 06 Dec, 2012 2 commits
  2. 05 Dec, 2012 3 commits
  3. 04 Dec, 2012 10 commits
    • watchdog: Fix CPU hotplug regression · 8d451690
      Thomas Gleixner authored
      Norbert reported:
      "3.7-rc6 booted with nmi_watchdog=0 fails to suspend to RAM or
       offline CPUs. It's reproducible with a KVM guest and physical
       system."
      
      The reason is that commit bcd951cf ("watchdog: Use hotplug thread
      infrastructure") did not take the nmi_watchdog=0 case into account. So
      the cpu offline code gets stuck in the teardown function because it
      accesses uninitialized data structures.
      
      Add a check for watchdog_enabled into that path to cure the issue.
      Reported-and-tested-by: Norbert Warmuth <nwarmuth@t-online.de>
      Tested-by: Joseph Salisbury <joseph.salisbury@canonical.com>
      Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1211231033230.2701@ionos
      Link: http://bugs.launchpad.net/bugs/1079534
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      8d451690
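The guard described in the watchdog fix above can be sketched in userspace C. This is only an illustration: `watchdog_on`, `struct cpu_watchdog` and `watchdog_teardown()` are hypothetical stand-ins, not the kernel's identifiers. The point is that teardown returns early when setup never ran, so it never touches uninitialized state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-CPU state; 'timer' is only valid once setup has run. */
struct cpu_watchdog {
    int *timer;
    int stopped;
};

/* Stand-in for the "watchdog enabled" switch (assumed name). */
static int watchdog_on;

static void watchdog_teardown(struct cpu_watchdog *wd)
{
    assert(wd != NULL);

    /* Nothing was set up when the watchdog is disabled, so there is
     * nothing to tear down. Without this check the code below would
     * dereference an uninitialized pointer. */
    if (!watchdog_on)
        return;

    *wd->timer = 0;
    wd->stopped = 1;
}
```

With the switch off, calling `watchdog_teardown()` on a zeroed structure is a no-op instead of a crash or hang.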
    • Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux · df2fc246
      Linus Torvalds authored
      Pull module fixes from Rusty Russell:
       "Module signing build fixes for blackfin and metag"
      
      * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
        modsign: add symbol prefix to certificate list
        linux/kernel.h: define SYMBOL_PREFIX
      df2fc246
    • Merge tag 'upstream-3.7-rc9' of git://git.infradead.org/linux-ubi · 70dcc535
      Linus Torvalds authored
      Pull UBI changes from Artem Bityutskiy:
       "Fixes for 2 brown-paperbag bugs introduced this merge window by the
        fastmap code:
      
         1.  The UBI background thread got stuck when a bit-flip happened
             because a free LEB was not removed from the "free" tree when we
             started using it.
         2.  I/O debugging checks did not work because we called a sleeping
             function in atomic context."
      
      * tag 'upstream-3.7-rc9' of git://git.infradead.org/linux-ubi:
        UBI: dont call ubi_self_check_all_ff() in __wl_get_peb()
        UBI: remove PEB from free tree in get_peb_for_wl()
      70dcc535
    • Merge branch 'for-3.7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq · ca50496e
      Linus Torvalds authored
      Pull workqueue fixes from Tejun Heo:
       "So, safe fixes my ass.
      
        Commit 8852aac2 ("workqueue: mod_delayed_work_on() shouldn't queue
        timer on 0 delay") had the side-effect of performing delayed_work
        sanity checks even when @delay is 0, which should be fine for any sane
        use cases.
      
        Unfortunately, megaraid was being overly ingenious.  It seemingly
        wanted to use cancel_delayed_work_sync() before cancel_work_sync() was
        introduced, but didn't want to waste the space for full delayed_work
        as it was only going to use 0 @delay.  So, it only allocated space for
        struct work_struct and then cast it to struct delayed_work and passed
        it into delayed_work functions - truly awesome engineering tradeoff to
        save some bytes.
      
        Xiaotian fixed it by making megaraid allocate full delayed_work for
        now.  It should be converted to use work_struct and cancel_work_sync()
        but I think we better do that after 3.7.
      
        I added another commit to change BUG_ON()s in __queue_delayed_work()
        to WARN_ON_ONCE()s so that the kernel doesn't crash even if there are
        more such abuses."
      
      * 'for-3.7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
        workqueue: convert BUG_ON()s in __queue_delayed_work() to WARN_ON_ONCE()s
        megaraid: fix BUG_ON() from incorrect use of delayed work
      ca50496e
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc · 609e3ff3
      Linus Torvalds authored
      Pull sparc fixes from David Miller:
       "Two small fixes for Sparc, nobody uses sparc, so these are low risk :-)
      
         1) Piggyback is too picky about the symbol types that _start and _end
            have in the final kernel image, and it thus breaks with newer
            binutils.  Future proof by getting rid of the symbol type checks.
      
         2) exit_group() should kill register windows on sparc64 the same way
            we do for plain exit().  Thanks to Al Viro for spotting this."
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
        sparc: Fix piggyback with newer binutils.
        sparc64: exit_group should kill register windows just like plain exit.
      609e3ff3
    • vfs: avoid "attempt to access beyond end of device" warnings · 57302e0d
      Linus Torvalds authored
      The block device access simplification that avoided accessing the (racy)
      block size information (commit bbec0270: "blkdev_max_block: make
      private to fs/buffer.c") no longer checks the maximum block size in the
      block mapping path.
      
      That was _almost_ as simple as just removing the code entirely, because
      the readers and writers all check the size of the device anyway, so
      under normal circumstances it "just worked".
      
      However, the block size may be such that the end of the device may
      straddle one single buffer_head.  At which point we may still want to
      access the end of the device, but the buffer we use to access it
      partially extends past the end.
      
      The 'bd_set_size()' function intentionally sets the block size to avoid
      this, but mounting the device - or setting the block size by hand to
      some other value - can modify that block size.
      
      So instead, teach 'submit_bh()' about the special case of the buffer
      head straddling the end of the device, and turn such an access into a
      smaller IO access, avoiding the problem.
      
      This, btw, also means that unlike before, we can now access the whole
      device regardless of device block size setting.  So now, even if the
      device size is only 512-byte aligned, we can read and write even the
      last sector even when having a much bigger block size for accessing the
      rest of the device.
      
      So with this, we could now get rid of the 'bd_set_size()' block size
      code entirely - resulting in faster IO for the common case - but that
      would be a separate patch.
      Reported-and-tested-by: Romain Francoise <romain@orebokech.com>
      Reported-and-tested-by: Meelis Roos <mroos@linux.ee>
      Reported-by: Tony Luck <tony.luck@intel.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      57302e0d
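The idea of shrinking an access that straddles the end of the device, as described in the vfs commit above, can be sketched as plain arithmetic. This is not the code from fs/buffer.c: `clamp_io_bytes()` is a hypothetical helper and 512-byte sectors are assumed.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SHIFT 9 /* assumed 512-byte sectors */

/*
 * Hypothetical helper: given the device size in sectors, the first
 * sector of a buffer and the buffer size in bytes, return how many
 * bytes of that buffer actually lie inside the device.
 */
static uint32_t clamp_io_bytes(uint64_t dev_sectors, uint64_t start_sector,
                               uint32_t buf_bytes)
{
    uint64_t avail_bytes;

    assert(buf_bytes % (1u << SECTOR_SHIFT) == 0);

    /* Buffer starts at or past the end: nothing can be transferred. */
    if (start_sector >= dev_sectors)
        return 0;

    avail_bytes = (dev_sectors - start_sector) << SECTOR_SHIFT;

    /* Fully inside the device: transfer the whole buffer. */
    if (avail_bytes >= buf_bytes)
        return buf_bytes;

    /* Straddles the end: shrink the transfer to what remains. */
    return (uint32_t)avail_bytes;
}
```

For a 9-sector device accessed with 4096-byte buffers, the second buffer starts at sector 8 and is reduced to a single 512-byte sector, which is how the last sector stays reachable even with a large block size.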
    • workqueue: convert BUG_ON()s in __queue_delayed_work() to WARN_ON_ONCE()s · fc4b514f
      Tejun Heo authored
      8852aac2 ("workqueue: mod_delayed_work_on() shouldn't queue timer on
      0 delay") unexpectedly uncovered a very nasty abuse of delayed_work in
      megaraid - it allocated work_struct, cast it to delayed_work and
      then passed that into queue_delayed_work().
      
      Previously, this was okay because 0 @delay short-circuited to
      queue_work() before doing anything with delayed_work.  8852aac2
      moved 0 @delay test into __queue_delayed_work() after sanity check on
      delayed_work making megaraid trigger BUG_ON().
      
      Although megaraid is already fixed by c1d390d8 ("megaraid: fix
      BUG_ON() from incorrect use of delayed work"), this patch converts
      BUG_ON()s in __queue_delayed_work() to WARN_ON_ONCE()s so that such
      abusers, if there are more, trigger warning but don't crash the
      machine.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Xiaotian Feng <xtfeng@gmail.com>
      fc4b514f
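The difference between crashing on a failed sanity check and warning once can be sketched in userspace C. The helper below only imitates the behaviour of WARN_ON_ONCE(); `warn_on_once()`, `struct fake_dwork` and `queue_checked()` are hypothetical names for illustration and are not kernel code.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Userspace imitation of a warn-once check: report the first time the
 * condition is true, stay silent afterwards, and hand the condition
 * back so the caller can still react to it.
 */
static int warn_on_once(int cond, int *warned, const char *what)
{
    if (cond && !*warned) {
        *warned = 1;
        fprintf(stderr, "WARNING: %s\n", what);
    }
    return cond;
}

/* Hypothetical work item with just enough state for the demo. */
struct fake_dwork {
    int timer_pending;
    int queued;
};

static int warnings_seen; /* set to 1 after the first warning */

static void queue_checked(struct fake_dwork *dw)
{
    assert(dw != NULL);

    /* A BUG_ON()-style check would stop everything here; this one
     * complains once and lets the caller carry on. */
    warn_on_once(dw->timer_pending, &warnings_seen, "timer already pending");
    dw->queued++;
}
```

Calling `queue_checked()` twice on an item whose `timer_pending` is set prints one warning and still queues both times.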
    • megaraid: fix BUG_ON() from incorrect use of delayed work · c1d390d8
      Xiaotian Feng authored
      megaraid uses INIT_WORK to declare a hotplug_work, but casts the
      hotplug_work from work_struct to delayed_work and calls
      schedule_delayed_work() on it.  This is very dangerous, as the rest of
      the delayed_work might be kernel memory allocated by others.
      
      With commit 8852aac2 ("workqueue: mod_delayed_work_on() shouldn't queue
      timer on 0 delay"), schedule_delayed_work() will check dwork->timer
      before queue_work() even when @delay is 0, which causes the megaraid
      code to hit the BUG_ON() in workqueue code.  Change megaraid code to use
      delayed work.
      Signed-off-by: Xiaotian Feng <dannyfeng@tencent.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Neela Syam Kolli <megaraidlinux@lsi.com>
      Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
      Cc: linux-scsi@vger.kernel.org
      c1d390d8
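Why the cast described in the megaraid fix above is unsafe can be seen from a simplified layout. The structures below are reduced stand-ins (the real kernel definitions have more fields), and `cast_overruns()` is a hypothetical helper for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins; not the kernel's actual definitions. */
struct work_struct {
    void (*func)(struct work_struct *);
};

struct timer_list {
    unsigned long expires;
    void *data;
};

struct delayed_work {
    struct work_struct work; /* must stay the first member */
    struct timer_list timer; /* lives past the end of 'work' */
};

/* The timer sits beyond the bytes a bare work_struct occupies. */
_Static_assert(offsetof(struct delayed_work, timer) >=
               sizeof(struct work_struct),
               "timer overlaps work_struct");

/* Where the timer starts inside a delayed_work. */
static size_t timer_offset(void)
{
    return offsetof(struct delayed_work, timer);
}

/*
 * Returns 1 if an allocation of alloc_bytes is too small to be treated
 * as a delayed_work, i.e. touching ->timer would read or write memory
 * that belongs to something else.
 */
static int cast_overruns(size_t alloc_bytes)
{
    return alloc_bytes < sizeof(struct delayed_work);
}
```

An allocation sized for `struct work_struct` overruns as soon as any delayed_work function looks at the timer, which matches the sanity check that started firing once the zero-delay path stopped short-circuiting.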
    • UBI: dont call ubi_self_check_all_ff() in __wl_get_peb() · 894aef21
      Richard Weinberger authored
      As ubi_self_check_all_ff() might sleep we are not allowed
      to call it from atomic context.
      For now we call it only from ubi_wl_get_peb().
      There are some code paths where it would also make sense,
      but these paths are currently atomic and only enabled
      when fastmap is used.
      Signed-off-by: Richard Weinberger <richard@nod.at>
      Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
      894aef21
    • UBI: remove PEB from free tree in get_peb_for_wl() · ed4b7021
      Richard Weinberger authored
      If UBI is built without fastmap, get_peb_for_wl() has to
      remove the PEB manually from the free tree.
      Otherwise the requested PEB lives in two trees.
      Reported-by: Zach Sadecki <zsadecki@itwatchdogs.com>
      Signed-off-by: Richard Weinberger <richard@nod.at>
      Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
      ed4b7021
  4. 03 Dec, 2012 12 commits
    • sparc: Fix piggyback with newer binutils. · 0032c857
      David S. Miller authored
      Newer versions of binutils mark '_end' as 'B' instead of 'A' for
      whatever reason.
      
      To be honest, the piggyback code doesn't actually care what kind
      of symbol _start and _end are, it just wants to find them and
      record the address.
      
      So remove the type from the match strings.
      Reported-by: Aaro Koskinen <aaro.koskinen@iki.fi>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0032c857
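Matching a symbol by name while ignoring its type letter, as the piggyback fix above describes, might look like the sketch below. It assumes a symbol-map line of the form "address type name"; `match_symbol()` is a hypothetical helper and not the code from the piggyback tool.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse one "<hex address> <type letter> <name>" line and report the
 * address if the name matches, whatever the type letter is.
 * Returns 1 on a match, 0 otherwise.
 */
static int match_symbol(const char *line, const char *wanted,
                        unsigned long *addr)
{
    unsigned long a;
    char type;
    char name[128];

    assert(line != NULL && wanted != NULL && addr != NULL);

    if (sscanf(line, "%lx %c %127s", &a, &type, name) != 3)
        return 0;
    if (strcmp(name, wanted) != 0)
        return 0;

    /* 'type' is parsed but deliberately not compared: 'A' and 'B'
     * are both accepted. */
    *addr = a;
    return 1;
}
```

With this approach a line marking `_end` as 'B' is found just like one marking it as 'A'.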
    • Linux 3.7-rc8 · b69f0859
      Linus Torvalds authored
      b69f0859
    • David S. Miller's avatar
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac · b52c6402
      Linus Torvalds authored
      Pull EDAC fixes from Mauro Carvalho Chehab:
       "One EDAC core fix, and a few driver fixes (i7300, i82975x, i7core)."
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac:
        i7core_edac: fix panic when accessing sysfs files
        i7300_edac: Fix error flag testing
        edac: Fix the dimm filling for csrows-based layouts
        i82975x_edac: Fix dimm label initialization
      b52c6402
    • Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media · 4ba00329
      Linus Torvalds authored
      Pull media fixes from Mauro Carvalho Chehab:
       "Some driver fixes for s5p/exynos (mostly race fixes)"
      
      * 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
        [media] s5p-mfc: Handle multi-frame input buffer
        [media] s5p-mfc: Bug fix of timestamp/timecode copy mechanism
        [media] exynos-gsc: Add missing video device vfl_dir flag initialization
        [media] exynos-gsc: Fix settings for input and output image RGB type
        [media] exynos-gsc: Don't use mutex_lock_interruptible() in device release()
        [media] fimc-lite: Don't use mutex_lock_interruptible() in device release()
        [media] s5p-fimc: Don't use mutex_lock_interruptible() in device release()
        [media] s5p-fimc: Prevent race conditions during subdevs registration
      4ba00329
    • [parisc] open(2) compat bug · 25a3bc6b
      Al Viro authored
      In commit 9d73fc2d ("open*(2) compat fixes (s390, arm64)") I said:
      >
      > 	The usual rules for open()/openat()/open_by_handle_at() are
      > 1) native 32bit - don't force O_LARGEFILE in flags
      > 2) native 64bit - force O_LARGEFILE in flags
      > 3) compat on 64bit host - as for native 32bit
      > 4) native 32bit ABI for 64bit system (mips/n32, x86/x32) - as for native 64bit
      >
      > There are only two exceptions - s390 compat has open() forcing O_LARGEFILE and
      > arm64 compat has open_by_handle_at() doing the same thing.  The same binaries
      > on native host (s390/31 and arm resp.) will *not* force O_LARGEFILE, so IMO
      > both are emulation bugs.
      
      Three exceptions, actually - parisc open() is another case like that.
      Native 32bit won't force O_LARGEFILE, the same binary on parisc64 will.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      25a3bc6b
    • Revert "sched, autogroup: Stop going ahead if autogroup is disabled" · fd8ef117
      Mike Galbraith authored
      This reverts commit 800d4d30.
      
      Between commits 8323f26c ("sched: Fix race in task_group()") and
      800d4d30 ("sched, autogroup: Stop going ahead if autogroup is
      disabled"), autogroup is a wreck.
      
      With both applied, all you have to do to crash a box is disable
      autogroup during boot up, then reboot..  boom, NULL pointer dereference
      due to commit 800d4d30 not allowing autogroup to move things, and
      commit 8323f26c making that the only way to switch runqueues:
      
        BUG: unable to handle kernel NULL pointer dereference at           (null)
        IP: [<ffffffff81063ac0>] effective_load.isra.43+0x50/0x90
        Pid: 7047, comm: systemd-user-se Not tainted 3.6.8-smp #7 MEDIONPC MS-7502/MS-7502
        RIP: effective_load.isra.43+0x50/0x90
        Process systemd-user-se (pid: 7047, threadinfo ffff880221dde000, task ffff88022618b3a0)
        Call Trace:
          select_task_rq_fair+0x255/0x780
          try_to_wake_up+0x156/0x2c0
          wake_up_state+0xb/0x10
          signal_wake_up+0x28/0x40
          complete_signal+0x1d6/0x250
          __send_signal+0x170/0x310
          send_signal+0x40/0x80
          do_send_sig_info+0x47/0x90
          group_send_sig_info+0x4a/0x70
          kill_pid_info+0x3a/0x60
          sys_kill+0x97/0x1a0
          ? vfs_read+0x120/0x160
          ? sys_read+0x45/0x90
          system_call_fastpath+0x16/0x1b
        Code: 49 0f af 41 50 31 d2 49 f7 f0 48 83 f8 01 48 0f 46 c6 48 2b 07 48 8b bf 40 01 00 00 48 85 ff 74 3a 45 31 c0 48 8b 8f 50 01 00 00 <48> 8b 11 4c 8b 89 80 00 00 00 49 89 d2 48 01 d0 45 8b 59 58 4c
        RIP  [<ffffffff81063ac0>] effective_load.isra.43+0x50/0x90
         RSP <ffff880221ddfbd8>
        CR2: 0000000000000000
      Signed-off-by: Mike Galbraith <efault@gmx.de>
      Acked-by: Ingo Molnar <mingo@kernel.org>
      Cc: Yong Zhang <yong.zhang0@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: stable@vger.kernel.org # 2.6.39+
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      fd8ef117
    • Merge branch 'block-dev' · d3594ea2
      Linus Torvalds authored
      Merge 'block-dev' branch.
      
      I was going to just mark everything here for stable and leave it to the
      3.8 merge window, but having decided on doing another -rc, I might as
      well merge it now.
      
      This removes the bd_block_size_semaphore semaphore that was added in
      this release to fix a race condition between block size changes and
      block IO, and replaces it with atomicity guarantees in fs/buffer.c
      instead, along with simplifying fs/block-dev.c.
      
      This removes more lines than it adds, makes the code generally simpler,
      and avoids the latency/rt issues that the block size semaphore
      introduced for mount.
      
      I'm not happy with the timing, but it wouldn't be much better doing this
      during the merge window and then having some delayed back-port of it
      into stable.
      
      * block-dev:
        blkdev_max_block: make private to fs/buffer.c
        direct-io: don't read inode->i_blkbits multiple times
        blockdev: remove bd_block_size_semaphore again
        fs/buffer.c: make block-size be per-page and protected by the page lock
      d3594ea2
    • modsign: add symbol prefix to certificate list · 84ecfd15
      James Hogan authored
      Add the arch symbol prefix (if applicable) to the asm definition of
      modsign_certificate_list and modsign_certificate_list_end. This uses the
      recently defined SYMBOL_PREFIX which is derived from
      CONFIG_SYMBOL_PREFIX.
      
      This fixes the build of module signing on the blackfin and metag
      architectures.
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Mike Frysinger <vapier@gentoo.org>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      84ecfd15
    • linux/kernel.h: define SYMBOL_PREFIX · cbdbf2ab
      James Hogan authored
      Define SYMBOL_PREFIX to be the same as CONFIG_SYMBOL_PREFIX if set by
      the architecture, or "" otherwise. This avoids the need for ugly #ifdefs
      whenever symbols are referenced in asm blocks.
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Joe Perches <joe@perches.com>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Jean Delvare <khali@linux-fr.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Mike Frysinger <vapier@gentoo.org>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      cbdbf2ab
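The fallback described in the SYMBOL_PREFIX commit above can be sketched as follows. The prefix value "_" is an assumption chosen for the demo, and `cert_list_symbol()` is a hypothetical helper that only shows how the macro is meant to be used; it is not the code added to linux/kernel.h.

```c
#include <assert.h>
#include <string.h>

/* Assumed for this demo: pretend the architecture configured a "_"
 * prefix. Remove this define to exercise the no-prefix case. */
#define CONFIG_SYMBOL_PREFIX "_"

/* Fall back to an empty string when the architecture sets no prefix. */
#ifdef CONFIG_SYMBOL_PREFIX
#define SYMBOL_PREFIX CONFIG_SYMBOL_PREFIX
#else
#define SYMBOL_PREFIX ""
#endif

/*
 * Adjacent string literals are concatenated by the compiler, so the
 * use site needs no #ifdef of its own.
 */
static const char *cert_list_symbol(void)
{
    return SYMBOL_PREFIX "modsign_certificate_list";
}
```

The same concatenation trick works inside an asm block, which is where the modsign commit applies it.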
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 7e5530af
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) 8139cp leaks memory in error paths, from Francois Romieu.
      
       2) do_tcp_sendpages() cannot handle order > 0 pages, but they can
          certainly arrive there now, fix from Eric Dumazet.
      
       3) Race condition and sysfs fixes in bonding from Nikolay Aleksandrov.
      
       4) Remain-on-Channel fix in mac80211 from Felix Liao.
      
       5) CCK rate calculation fix in iwlwifi, from Emmanuel Grumbach.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
        8139cp: fix coherent mapping leak in error path.
        tcp: fix crashes in do_tcp_sendpages()
        bonding: fix race condition in bonding_store_slaves_active
        bonding: make arp_ip_target parameter checks consistent with sysfs
        bonding: fix miimon and arp_interval delayed work race conditions
        mac80211: fix remain-on-channel (non-)cancelling
        iwlwifi: fix the basic CCK rates calculation
      7e5530af
    • Merge tag 'md-3.7-fixes' of git://neil.brown.name/md · 4ccc8045
      Linus Torvalds authored
      Pull md bugfix from NeilBrown:
       "Single bugfix for raid1/raid10.
      
        Fixes a recently introduced deadlock."
      
      * tag 'md-3.7-fixes' of git://neil.brown.name/md:
        md/raid1{,0}: fix deadlock in bitmap_unplug.
      4ccc8045
  5. 02 Dec, 2012 5 commits
    • open*(2) compat fixes (s390, arm64) · 9d73fc2d
      Al Viro authored
      The usual rules for open()/openat()/open_by_handle_at() are
       1) native 32bit - don't force O_LARGEFILE in flags
       2) native 64bit - force O_LARGEFILE in flags
       3) compat on 64bit host - as for native 32bit
       4) native 32bit ABI for 64bit system (mips/n32, x86/x32) - as for
          native 64bit
      
      There are only two exceptions - s390 compat has open() forcing
      O_LARGEFILE and arm64 compat has open_by_handle_at() doing the same
      thing.  The same binaries on native host (s390/31 and arm resp.) will
      *not* force O_LARGEFILE, so IMO both are emulation bugs.
      
      Objections? The fix is obvious...
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      9d73fc2d
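The four rules listed in the open*(2) commit above can be written down as a small table in code. This is a sketch only: the `enum abi` names, `effective_open_flags()` and the flag value are hypothetical, and the real O_LARGEFILE value is architecture specific.

```c
#include <assert.h>

/* Hypothetical labels for the four cases in the commit message. */
enum abi {
    NATIVE_32,    /* 1) native 32bit */
    NATIVE_64,    /* 2) native 64bit */
    COMPAT_ON_64, /* 3) compat on 64bit host */
    ILP32_ON_64   /* 4) native 32bit ABI for 64bit system (n32, x32) */
};

/* Illustrative value only; not taken from any kernel header. */
#define SKETCH_O_LARGEFILE 0x8000

static int effective_open_flags(enum abi abi, int flags)
{
    switch (abi) {
    case NATIVE_64:
    case ILP32_ON_64:
        /* Rules 2 and 4: force the large-file flag. */
        return flags | SKETCH_O_LARGEFILE;
    case NATIVE_32:
    case COMPAT_ON_64:
        /* Rules 1 and 3: leave the caller's flags untouched. */
        return flags;
    default:
        assert(!"unknown ABI");
        return flags;
    }
}
```

The bugs being fixed are cases where a compat path behaved like `NATIVE_64` although the same binary on a native 32bit host gets the `NATIVE_32` behaviour.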
    • Merge branch 'for-3.7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq · 3c46f3d6
      Linus Torvalds authored
      Pull late workqueue fixes from Tejun Heo:
       "Unfortunately, I have two really late fixes.  One was for a
        long-standing bug and queued for 3.8 but I found out about a
        regression introduced during 3.7-rc1 two days ago, so I'm sending out
        the two fixes together.
      
        The first (long-standing) one is rescuer_thread() entering exit path
        w/ TASK_INTERRUPTIBLE.  It only triggers on workqueue destructions
        which isn't very frequent and the exit path can usually survive being
        called with TASK_INTERRUPTIBLE, so it was hidden pretty well.  Apparently,
        if you're reiserfs, this could lead to the exiting kthread sleeping
        indefinitely holding a mutex, which is never good.
      
        The fix is simple - restoring TASK_RUNNING before returning from the
        kthread function.
      
        The second one is introduced by the new mod_delayed_work().
        mod_delayed_work() was missing special case handling for 0 delay.
        Instead of queueing the work item immediately, it queued the timer
        which expires on the closest next tick.  Some users of the new
        function converted from "[__]cancel_delayed_work() +
        queue_delayed_work()" combination became unhappy with the extra delay.
      
        Block unplugging led to noticeably higher number of context switches
        and intel 6250 wireless failed to associate with WPA-Enterprise
        network.  The fix, again, is fairly simple.  The 0 delay special case
        logic from queue_delayed_work_on() should be moved to
        __queue_delayed_work() which is shared by both queue_delayed_work_on()
        and mod_delayed_work_on().
      
        The first one is difficult to trigger and the failure mode for the
        latter isn't completely catastrophic, so missing these two for 3.7
        wouldn't make it a disastrous release, but both bugs are nasty and the
        fixes are fairly safe"
      
      * 'for-3.7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
        workqueue: mod_delayed_work_on() shouldn't queue timer on 0 delay
        workqueue: exit rescuer_thread() as TASK_RUNNING
      3c46f3d6
    • 8139cp: fix coherent mapping leak in error path. · 892a925e
      françois romieu authored
      cp_open
      [...]
              rc = cp_alloc_rings(cp);
              if (rc)
                      return rc;
      
      cp_alloc_rings
      [...]
              mem = dma_alloc_coherent(&cp->pdev->dev, CP_RING_BYTES,
                                       &cp->ring_dma, GFP_KERNEL);
      
      - cp_alloc_rings never frees the coherent mapping it allocates
      - neither does cp_open when cp_alloc_rings fails
      Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      892a925e
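One way to close a leak of the kind described in the 8139cp commit above is to release the allocation in the allocating function's own error path. The userspace sketch below uses malloc() in place of dma_alloc_coherent(); `struct ring_dev`, `init_rings()`, `alloc_rings()` and `live_allocs` are hypothetical, and the failing init step is an assumption about where the error arises.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical device state; 'fail_init' lets the demo force an error. */
struct ring_dev {
    void *ring;
    int fail_init;
};

static int live_allocs; /* number of ring buffers not yet freed */

/* Assumed second setup step that can fail after the allocation. */
static int init_rings(struct ring_dev *d)
{
    return d->fail_init ? -1 : 0;
}

static int alloc_rings(struct ring_dev *d)
{
    int rc;

    assert(d != NULL);

    d->ring = malloc(4096); /* stands in for dma_alloc_coherent() */
    if (!d->ring)
        return -1;
    live_allocs++;

    rc = init_rings(d);
    if (rc) {
        /* Error path: give the buffer back before reporting failure,
         * because the caller simply returns rc and frees nothing. */
        free(d->ring);
        d->ring = NULL;
        live_allocs--;
    }
    return rc;
}
```

Because the caller only propagates the error code, keeping the cleanup next to the allocation means every failing exit leaves no buffer behind.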
    • tcp: fix crashes in do_tcp_sendpages() · 64022d0b
      Eric Dumazet authored
      Recent network changes allowed high-order pages to be used
      for skb fragments.
      
      This uncovered a bug in do_tcp_sendpages() which was assuming its caller
      provided an array of order-0 page pointers.
      
      We only have to deal with a single page in this function, and its order
      is irrelevant.
      Reported-by: Willy Tarreau <w@1wt.eu>
      Tested-by: Willy Tarreau <w@1wt.eu>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      64022d0b
    • workqueue: mod_delayed_work_on() shouldn't queue timer on 0 delay · 8852aac2
      Tejun Heo authored
      8376fe22 ("workqueue: implement mod_delayed_work[_on]()")
      implemented mod_delayed_work[_on]() using the improved
      try_to_grab_pending().  The function is later used, among others, to
      replace [__]cancel_delayed_work() + queue_delayed_work() combinations.
      
      Unfortunately, a delayed_work item w/ zero @delay is handled slightly
      differently by mod_delayed_work_on() compared to
      queue_delayed_work_on().  The latter skips timer altogether and
      directly queues it using queue_work_on() while the former schedules
      timer which will expire on the closest tick.  This means, when @delay
      is zero, that [__]cancel_delayed_work() + queue_delayed_work_on()
      makes the target item immediately executable while
      mod_delayed_work_on() may induce delay of up to a full tick.
      
      This somewhat subtle difference breaks some of the converted users.
      e.g. block queue plugging uses delayed_work for deferred processing
      and uses mod_delayed_work_on() when the queue needs to be immediately
      unplugged.  The above problem manifested as noticeably higher number
      of context switches under certain circumstances.
      
      The difference in behavior was caused by missing special case handling
      for 0 delay in mod_delayed_work_on() compared to
      queue_delayed_work_on().  Joonsoo Kim posted a patch to add it -
      ("workqueue: optimize mod_delayed_work_on() when @delay == 0")[1].
      The patch was queued for 3.8 but it was described as optimization and
      I missed that it was a correctness issue.
      
      As both queue_delayed_work_on() and mod_delayed_work_on() use
      __queue_delayed_work() for queueing, it seems that the better approach
      is to move the 0 delay special handling to the function instead of
      duplicating it in mod_delayed_work_on().
      
      Fix the problem by moving 0 delay special case handling from
      queue_delayed_work_on() to __queue_delayed_work().  This replaces
      Joonsoo's patch.
      
      [1] http://thread.gmane.org/gmane.linux.kernel/1379011/focus=1379012
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-and-tested-by: Anders Kaseorg <andersk@MIT.EDU>
      Reported-and-tested-by: Zlatko Calusic <zlatko.calusic@iskon.hr>
      LKML-Reference: <alpine.DEB.2.00.1211280953350.26602@dr-wily.mit.edu>
      LKML-Reference: <50A78AA9.5040904@iskon.hr>
      Cc: Joonsoo Kim <js1304@gmail.com>
      8852aac2
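The restructuring described in the workqueue commit above, moving the zero-delay shortcut into the helper that both entry points share, can be sketched with a toy model. Nothing here is kernel code: `struct item`, `queue_delayed_common()`, `queue_delayed()` and `mod_delayed()` are hypothetical names, and timers are reduced to two fields.

```c
#include <assert.h>
#include <stddef.h>

/* Toy work item: either queued for immediate execution or waiting
 * on a timer that fires at 'expires'. */
struct item {
    int queued_now;
    int timer_armed;
    unsigned long expires;
};

/*
 * Shared helper. Both entry points below funnel through here, so the
 * zero-delay shortcut only has to exist in one place.
 */
static void queue_delayed_common(struct item *it, unsigned long now,
                                 unsigned long delay)
{
    assert(it != NULL);

    if (delay == 0) {
        /* Run as soon as possible; do not wait for the next tick. */
        it->queued_now = 1;
        return;
    }
    it->timer_armed = 1;
    it->expires = now + delay;
}

/* Queue only if the item is not already pending. */
static void queue_delayed(struct item *it, unsigned long now,
                          unsigned long delay)
{
    if (it->queued_now || it->timer_armed)
        return;
    queue_delayed_common(it, now, delay);
}

/* Cancel whatever was pending, then requeue with the new delay. */
static void mod_delayed(struct item *it, unsigned long now,
                        unsigned long delay)
{
    it->queued_now = 0;
    it->timer_armed = 0;
    queue_delayed_common(it, now, delay);
}
```

In this model a `mod_delayed()` call with a delay of 0 makes the item immediately runnable, which is the behaviour the converted users had relied on.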
  6. 01 Dec, 2012 8 commits