1. 21 Nov, 2007 7 commits
    • NETFILTER: nf_conntrack_tcp: fix connection reopening · 5263c68d
      Jozsef Kadlecsik authored
      Upstream commits: 17311393 + bc34b841 merged together.  Merge done by
      Patrick McHardy <kaber@trash.net>
      
      [NETFILTER]: nf_conntrack_tcp: fix connection reopening
      
      With your description I could reproduce the bug and actually you were
      completely right: the code above is incorrect. Somehow I was able to
      misread RFC1122 and mixed the roles :-(:
      
         When a connection is >>closed actively<<, it MUST linger in
         TIME-WAIT state for a time 2xMSL (Maximum Segment Lifetime).
         However, it MAY >>accept<< a new SYN from the remote TCP to
         reopen the connection directly from TIME-WAIT state, if it:
         [...]
      
      The fix is as follows: if the receiver initiated an active close, then the
      sender may reopen the connection - otherwise try to figure out if we hold
      a dead connection.
      Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
      Tested-by: Krzysztof Piotr Oledzki <ole@ans.pl>
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • fix param_sysfs_builtin name length check · 79d84e19
      Jan Kiszka authored
      patch 22800a28 in mainline.
      
      Commit faf8c714 caused a regression:
      parameter names longer than MAX_KBUILD_MODNAME will now be rejected,
      although we just need to keep the module name part that short.  This patch
      restores the old behaviour while still avoiding that memchr is called with
      its length parameter larger than the total string length.
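
      The bounded search described above can be sketched in user space.
      This is only an illustration, not the kernel code: MAX_MODNAME_LEN and
      module_name_len() are hypothetical names, and the limit of 8 is chosen
      only to keep the example short.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative limit only; stands in for the kernel's MAX_KBUILD_MODNAME. */
#define MAX_MODNAME_LEN 8

/*
 * Return the length of the module-name prefix of "modname.param", or -1 if
 * no '.' is found within the first MAX_MODNAME_LEN characters.
 *
 * The search length is clamped to the string length, so memchr() is never
 * asked to look past the end of the string, while the parameter part after
 * the '.' may be arbitrarily long.
 */
long module_name_len(const char *name)
{
	size_t limit = strlen(name);
	const char *dot;

	if (limit > MAX_MODNAME_LEN)
		limit = MAX_MODNAME_LEN;

	dot = memchr(name, '.', limit);
	if (!dot)
		return -1;
	return (long)(dot - name);
}
```

      With this shape, "usb.a_very_long_parameter_name" is accepted because
      only the "usb" prefix has to fit the limit, while a name whose module
      part is too long is still rejected.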
      Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
      Cc: Dave Young <hidave.darkstar@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Chuck Ebbert <cebbert@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • fix tmpfs BUG and AOP_WRITEPAGE_ACTIVATE · fed52120
      Hugh Dickins authored
      patch 487e9bf2 in mainline.
      
      It's possible to provoke unionfs (not yet in mainline, though in mm and
      some distros) to hit shmem_writepage's BUG_ON(page_mapped(page)).  I expect
      it's possible to provoke the 2.6.23 ecryptfs in the same way (but the
      2.6.24 ecryptfs no longer calls lower level's ->writepage).
      
      This came to light with the recent find that AOP_WRITEPAGE_ACTIVATE could
      leak from tmpfs via write_cache_pages and unionfs to userspace.  There's
      already a fix (e4230030 - writeback: don't
      propagate AOP_WRITEPAGE_ACTIVATE) in the tree for that, and it's okay so
      far as it goes; but insufficient because it doesn't address the underlying
      issue, that shmem_writepage expects to be called only by vmscan (relying on
      backing_dev_info capabilities to prevent the normal writeback path from
      ever approaching it).
      
      That's an increasingly fragile assumption, and ramdisk_writepage (the other
      source of AOP_WRITEPAGE_ACTIVATEs) is already careful to check
      wbc->for_reclaim before returning it.  Make the same check in
      shmem_writepage, thereby sidestepping the page_mapped BUG also.
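
      A toy user-space model of that check, to make the control flow
      concrete.  Every name here (toy_wbc, toy_page, toy_writepage,
      TOY_WRITEPAGE_ACTIVATE, swap_available) is a hypothetical stand-in and
      not a kernel definition.  What the non-reclaim branch does (redirty,
      unlock, report success) is an assumption of this sketch, and the
      reclaim branch is heavily simplified.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the writeback-internal return code. */
enum { TOY_WRITEPAGE_ACTIVATE = 1 };

/* Hypothetical, cut-down stand-ins for writeback_control and page. */
struct toy_wbc {
	bool for_reclaim;
};

struct toy_page {
	bool dirty;
	bool locked;
};

/*
 * Model of a ->writepage that is only meaningful under memory reclaim.
 *
 * Callers outside reclaim (for example a stacking filesystem) get the page
 * redirtied and unlocked plus a plain success code, so the internal
 * ACTIVATE marker can only ever be returned on the reclaim path.
 */
int toy_writepage(struct toy_page *page, const struct toy_wbc *wbc,
		  bool swap_available)
{
	if (!wbc->for_reclaim) {
		page->dirty = true;
		page->locked = false;
		return 0;
	}

	if (!swap_available) {
		/* Reclaim cannot make progress: keep the page dirty. */
		page->dirty = true;
		return TOY_WRITEPAGE_ACTIVATE;
	}

	/* Pretend the page was written out to swap. */
	page->dirty = false;
	page->locked = false;
	return 0;
}
```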
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Cc: Erez Zadok <ezk@cs.sunysb.edu>
      Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • writeback: don't propagate AOP_WRITEPAGE_ACTIVATE · b2e5acb6
      Andrew Morton authored
      patch e4230030 in mainline.
      
      This is a writeback-internal marker but we're propagating it all the way back
      to userspace!
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • x86: fix TSC clock source calibration error · e8b00d71
      Dave Johnson authored
      patch edaf420f in mainline.
      
      I ran into this problem on a system that was unable to obtain NTP sync
      because the clock was running very slow (over 10000ppm slow). ntpd had
      declared all of its peers 'reject' with 'peer_dist' reason.
      
      On investigation, the tsc_khz variable was significantly incorrect
      causing xtime to run slow.  After a reboot tsc_khz was correct so I
      did a reboot test to see how often the problem occurred:
      
      Test was done on a 2000 MHz Xeon system.  Of 689 reboots, 8 of them
      had unacceptable tsc_khz values (>500ppm):
      
       range of tsc_khz  # of boots  % of boots
       ----------------  ----------  ----------
              < 1999750           0      0.000%
      1999750 - 1999800          21      3.048%
      1999800 - 1999850         166     24.128%
      1999850 - 1999900         241     35.029%
      1999900 - 1999950         211     30.669%
      1999950 - 2000000          42      6.105%
      2000000 - 2000050           0      0.000%
      2000050 - 2000100           0      0.000%
                         [...]
      2000100 - 2015000           1      0.145%  << BAD
      2015000 - 2030000           6      0.872%  << BAD
      2030000 - 2045000           1      0.145%  << BAD
      2045000 <                   0      0.000%
      
      The worst boot was 2032.577 MHz, over 1.5% off!
      
      It appears that on rare occasions, mach_countup() is taking longer to
      complete than necessary.
      
      I suspect that this is caused by the CPU taking a periodic SMI
      interrupt right at the end of the 30ms calibration loop.  This would
      cause the loop to delay while the SMI BIOS handler runs.  The resulting
      TSC value is beyond what it actually should be, resulting in a higher
      tsc_khz.
      
      The below patch makes native_calculate_cpu_khz() take the best
      (shortest duration, lowest khz) run of its 3 calibration loops.  If an
      SMI goes off causing a bad result (long duration, higher khz) it will
      be discarded.
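
      The selection rule can be sketched as follows.  This is a user-space
      illustration rather than the patch itself: best_tsc_khz() and
      khz_error_ppm() are hypothetical helpers, and the ppm figure is
      computed against the nominal 2000000 kHz purely for illustration.

```c
#include <stdint.h>

/*
 * Pick the lowest kHz estimate out of several calibration runs.
 *
 * A run that is stretched by an interruption counts extra TSC ticks over
 * the nominal interval and therefore reads high, so the minimum is taken
 * as the least-disturbed run.
 */
uint64_t best_tsc_khz(const uint64_t khz[], int runs)
{
	uint64_t best = UINT64_MAX;
	int i;

	for (i = 0; i < runs; i++)
		if (khz[i] < best)
			best = khz[i];
	return best;
}

/* Signed deviation from a nominal frequency, in parts per million. */
int64_t khz_error_ppm(uint64_t measured_khz, uint64_t nominal_khz)
{
	int64_t diff = (int64_t)measured_khz - (int64_t)nominal_khz;

	return diff * 1000000 / (int64_t)nominal_khz;
}
```

      Fed the worst boot from the table above (2032577 kHz) alongside two
      ordinary readings, best_tsc_khz() discards the outlier, and
      khz_error_ppm() shows why it matters: the outlier is far beyond the
      500ppm threshold mentioned earlier.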
      
      With the patch applied, 300 boots of the same system produce good
      results:
      
       range of tsc_khz  # of boots  % of boots
       ----------------  ----------  ----------
              < 1999750           0      0.000%
      1999750 - 1999800          30     10.000%
      1999800 - 1999850         166     55.333%
      1999850 - 1999900          89     29.667%
      1999900 - 1999950          15      5.000%
      1999950 <                   0      0.000%
      
      Problem was found and tested against 2.6.18.  Patch is against 2.6.22.
      Signed-off-by: Dave Johnson <djohnson@sw.starentnetworks.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • Fix compat futex hangs. · 52c7418e
      David Miller authored
      [FUTEX]: Fix address computation in compat code.
      
      [ Upstream commit: 3c5fd9c7 ]
      
      compat_exit_robust_list() computes a pointer to the
      futex entry in userspace as follows:
      
      	(void __user *)entry + futex_offset
      
      'entry' is a 'struct robust_list __user *', and
      'futex_offset' is a 'compat_long_t' (typically a 's32').
      
      Things explode if the 32-bit sign bit is set in futex_offset.
      
      Type promotion sign extends futex_offset to a 64-bit value before
      adding it to 'entry'.
      
      This triggered a problem on sparc64 running 32-bit applications which
      would lock up a cpu looping forever in the fault handling for the
      userspace load in handle_futex_death().
      
      Compat userspace runs with address masking (wherein the cpu zeros out
      the top 32-bits of every effective address given to a memory operation
      instruction) so the sparc64 fault handler accounts for this by
      zero'ing out the top 32-bits of the fault address too.
      
      Since the kernel properly uses the compat_uptr interfaces, kernel side
      accesses to compat userspace work too since they will only use
      addresses with the top 32 bits clear.
      
      Because of this compat futex layer bug we get into the following loop
      when executing the get_user() load near the top of handle_futex_death():
      
      1) load from address '0xfffffffff7f16bd8', FAULT
      2) fault handler clears upper 32-bits, processes fault
         for address '0xf7f16bd8' which succeeds
      3) goto #1
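
      The arithmetic behind that address can be reproduced in user space.
      This is a sketch under assumptions: the function names are
      hypothetical, and the entry and offset values used with it below are
      picked only so that the sum matches the faulting address quoted
      above.  Doing the addition in 32 bits is shown as one way to avoid the
      sign extension, not necessarily the exact form of the fix.

```c
#include <stdint.h>

/*
 * Buggy form: the signed 32-bit offset is sign-extended to 64 bits before
 * the addition, so an offset with its sign bit set fills the top 32 bits.
 */
uint64_t uaddr_sign_extended(uint64_t entry, int32_t futex_offset)
{
	return entry + (uint64_t)(int64_t)futex_offset;
}

/*
 * 32-bit form: add in 32 bits (wrapping), then zero-extend the result, so
 * the top 32 bits of the address are always clear.
 */
uint64_t uaddr_compat(uint64_t entry, int32_t futex_offset)
{
	uint32_t base = (uint32_t)entry;

	return (uint64_t)(uint32_t)(base + (uint32_t)futex_offset);
}

/* What the fault handler does to the address: clear the upper 32 bits. */
uint64_t mask_compat_address(uint64_t addr)
{
	return addr & 0xffffffffULL;
}
```

      With entry 0x10000 and an offset whose bit pattern is 0xf7f06bd8, the
      sign-extended form yields 0xfffffffff7f16bd8, while the 32-bit form
      and the masked fault address both give 0xf7f16bd8.  That mismatch is
      what keeps the load faulting in the loop above.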
      
      I want to thank Bernd Zeimetz, Josip Rodin, and Fabio Massimo Di Nitto
      for their tireless efforts helping me track down this bug.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • SLUB: Fix memory leak by not reusing cpu_slab · 06283bea
      Christoph Lameter authored
      backport of 05aa3450 from Linus's tree.
      
      SLUB: Fix memory leak by not reusing cpu_slab
      
      Fix the memory leak that may occur when we attempt to reuse a cpu_slab
      that was allocated while we reenabled interrupts in order to be able to
      grow a slab cache. The per cpu freelist may contain objects and in that
      situation we may overwrite the per cpu freelist pointer, losing objects.
      This only occurs if we find that the concurrently allocated slab fits
      our allocation needs.
      
      If we simply always deactivate the slab then the freelist will be properly
      reintegrated and the memory leak will go away.
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Cc: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
  2. 16 Nov, 2007 3 commits
  3. 05 Nov, 2007 10 commits
  4. 02 Nov, 2007 20 commits