1. 30 Nov, 2012 11 commits
    • Merge branch 'akpm' (Fixes from Andrew) · 50a53bbe
      Linus Torvalds authored
      Merge misc fixes from Andrew Morton:
       "Seven fixes, some of them fingers-crossed :("
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (7 patches)
        drivers/rtc/rtc-tps65910.c: fix invalid pointer access on _remove()
        mm: soft offline: split thp at the beginning of soft_offline_page()
        mm: avoid waking kswapd for THP allocations when compaction is deferred or contended
        revert "Revert "mm: remove __GFP_NO_KSWAPD""
        mm: vmscan: fix endless loop in kswapd balancing
        mm/vmemmap: fix wrong use of virt_to_page
        mm: compaction: fix return value of capture_free_page()
    • Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · 73efd00d
      Linus Torvalds authored
      Pull ARM SoC fixes from Arnd Bergmann:
       "These are three fixes for the Marvell EBU family and one for the
        Samsung s3c platforms.  All of them are obvious and should still make it
        into 3.7."
      
      * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
        ARM: Kirkwood: Update PCI-E fixup
        Dove: Fix irq_to_pmu()
        Dove: Attempt to fix PMU/RTC interrupts
        ARM: S3C24XX: Fix potential NULL pointer dereference error
    • Merge tag 'ixp4xx-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · 90bf80a1
      Linus Torvalds authored
      Pull ARM ixp4xx bug fixes from Arnd Bergmann:
       "These were originally prepared by Krzysztof Halasa but not submitted
        in time for v3.7 due to some confusion about how ixp4xx patches should
        be handled.  Jason Cooper thankfully offered to help out sending the
        patches upstream through arm-soc now, but given the timing, we could
        as well delay them for 3.8."
      
      * tag 'ixp4xx-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
        IXP4xx: use __iomem for MMIO
        IXP4xx: map CPU config registers within VMALLOC region.
        IXP4xx: Always ioremap() Queue Manager MMIO region at boot.
        ixp4xx: Declare MODULE_FIRMWARE usage
        IXP4xx crypto: MOD_AES{128,192,256} already include key size.
        WAN: Remove redundant HDLC info printed by IXP4xx HSS driver.
        IXP4xx: Remove time limit for PCI TRDY to enable use of slow devices.
        IXP4xx: ixp4xx_crypto driver requires Queue Manager and NPE drivers.
        IXP4xx: HW pseudo-random generator is available on IXP45x/46x only.
        IXP4xx: Fix off-by-one bug in Goramo MultiLink platform.
        IXP4xx: Fix Goramo MultiLink platform compilation.
    • Merge branch 'fixes' of git://git.linaro.org/people/rmk/linux-arm · 50a561ca
      Linus Torvalds authored
      Pull final ARM fix from Russell King:
       "One final fix, spotted by Will, to do with what happens when we boot a
        SMP kernel on UP."
      
      * 'fixes' of git://git.linaro.org/people/rmk/linux-arm:
        ARM: 7586/1: sp804: set cpumask to cpu_possible_mask for clock event device
    • drivers/rtc/rtc-tps65910.c: fix invalid pointer access on _remove() · 1430e178
      Kim, Milo authored
      The tps65910_rtc data is registered as the platform driver data in
      _probe().  Therefore the tps65910_rtc data should be used when
      unregistering the rtc device, and the device pointer should be
      retrieved from the platform_device structure.
      
      This patch fixes the below oops:
      
       Unable to handle kernel NULL pointer dereference at virtual address 00000008
       Modules linked in: rtc_tps65910(-)
       CPU: 0    Not tainted  (3.7.0-rc7-next-20121128-g6b1f974-dirty #7)
       PC is at tps65910_rtc_alarm_irq_enable+0x20/0x2c [rtc_tps65910]
           (tps65910_rtc_alarm_irq_enable+0x20/0x2c [rtc_tps65910])
           (tps65910_rtc_remove+0x18/0x28 [rtc_tps65910])
           (platform_drv_remove+0x18/0x1c)
           (__device_release_driver+0x70/0xcc)
           (driver_detach+0xb4/0xb8)
           (bus_remove_driver+0x7c/0xc0)
           (sys_delete_module+0x148/0x21c)
      Signed-off-by: Milo(Woogyom) Kim <milo.kim@ti.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: soft offline: split thp at the beginning of soft_offline_page() · 783657a7
      Naoya Horiguchi authored
      When we try to soft-offline a thp tail page, put_page() is called on the
      tail page unthinkingly and VM_BUG_ON is triggered in put_compound_page().
      
      This patch splits thp before going into the main body of soft-offlining.
      Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Andi Kleen <andi.kleen@intel.com>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: avoid waking kswapd for THP allocations when compaction is deferred or contended · 782fd304
      Mel Gorman authored
      With "mm: vmscan: scale number of pages reclaimed by reclaim/compaction
      based on failures" reverted, Zdenek Kabelac reported the following
      
        Hmm,  so it's just took longer to hit the problem and observe
        kswapd0 spinning on my CPU again - it's not as endless like before -
        but still it easily eats minutes - it helps to turn off  Firefox
        or TB  (memory hungry apps) so kswapd0 stops soon - and restart
        those apps again.  (And I still have like >1GB of cached memory)
      
        kswapd0         R  running task        0    30      2 0x00000000
        Call Trace:
          preempt_schedule+0x42/0x60
          _raw_spin_unlock+0x55/0x60
          put_super+0x31/0x40
          drop_super+0x22/0x30
          prune_super+0x149/0x1b0
          shrink_slab+0xba/0x510
      
      The sysrq+m indicates the system has no swap so it'll never reclaim
      anonymous pages as part of reclaim/compaction.  That is one part of the
      problem but not the root cause as file-backed pages could also be
      reclaimed.
      
      The likely underlying problem is that kswapd is woken up or kept awake
      for each THP allocation request in the page allocator slow path.
      
      If compaction fails for the requesting process then compaction will be
      deferred for a time and direct reclaim is avoided.  However, if there
      is a storm of THP requests that are simply rejected, it will still be
      the case that kswapd is awake for a prolonged period of time as
      pgdat->kswapd_max_order is updated each time.  This is noticed by the
      main kswapd() loop and it will not call kswapd_try_to_sleep().  Instead
      it will loop, shrinking a small number of pages and calling
      shrink_slab() on each iteration.
      
      This patch defers when kswapd gets woken up for THP allocations.  For
      !THP allocations, kswapd is always woken up.  For THP allocations,
      kswapd is woken up iff the process is willing to enter into direct
      reclaim/compaction.
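The wake-up policy described above can be modelled in a few lines of user-space C. This is only a sketch: struct toy_alloc_request and should_wake_kswapd() are hypothetical names, and reading "willing to enter into direct reclaim/compaction" as "compaction is neither deferred nor contended" is an assumption taken from the patch title, not a copy of the kernel code.

```c
#include <stdbool.h>

/* Hypothetical summary of the state the allocator slow path looks at. */
struct toy_alloc_request {
    bool is_thp;               /* transparent huge page allocation? */
    bool compaction_deferred;  /* compaction failed recently and is backed off */
    bool compaction_contended; /* compaction gave up because of contention */
};

/* Decide whether this request should wake kswapd (toy model). */
bool should_wake_kswapd(const struct toy_alloc_request *req)
{
    /* Non-THP allocations always wake kswapd. */
    if (!req->is_thp)
        return true;

    /* Assumption: a THP request is "willing" only if compaction can proceed. */
    return !(req->compaction_deferred || req->compaction_contended);
}
```

In this model a storm of rejected THP requests no longer keeps kswapd awake, because every rejected request has compaction_deferred or compaction_contended set.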
      
      [akpm@linux-foundation.org: fix typo in comment]
      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Cc: Zdenek Kabelac <zkabelac@redhat.com>
      Cc: Seth Jennings <sjenning@linux.vnet.ibm.com>
      Cc: Jiri Slaby <jirislaby@gmail.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Robert Jennings <rcj@linux.vnet.ibm.com>
      Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
      Cc: Glauber Costa <glommer@gmail.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • revert "Revert "mm: remove __GFP_NO_KSWAPD"" · a5091539
      Andrew Morton authored
      It appears that this patch was innocent, and we hope that "mm: avoid
      waking kswapd for THP allocations when compaction is deferred or
      contended" will fix the final kswapd-spinning cause.
      
      Cc: Zdenek Kabelac <zkabelac@redhat.com>
      Cc: Seth Jennings <sjenning@linux.vnet.ibm.com>
      Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
      Cc: Jiri Slaby <jirislaby@gmail.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Robert Jennings <rcj@linux.vnet.ibm.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: vmscan: fix endless loop in kswapd balancing · 60cefed4
      Johannes Weiner authored
      Kswapd does not in all places have the same criteria for a balanced
      zone.  Zones are only being reclaimed when their high watermark is
      breached, but compaction checks loop over the zonelist again when the
      zone does not meet the low watermark plus two times the size of the
      allocation.  This gets kswapd stuck in an endless loop over a small
      zone, like the DMA zone, where the high watermark is smaller than the
      compaction requirement.
      
      Add a function, zone_balanced(), that checks the watermark, and, for
      higher order allocations, if compaction has enough free memory.  Then
      use it uniformly to check for balanced zones.
      
      This makes sure that when the compaction watermark is not met, at least
      reclaim happens and progress is made - or the zone is declared
      unreclaimable at some point and skipped entirely.
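The mismatch between the two criteria, and the single check that replaces them, can be illustrated with a toy user-space model. This is a sketch under stated assumptions: struct toy_zone and its fields are hypothetical stand-ins for the kernel's zone bookkeeping, the helper names are invented, and zone_balanced() here only mirrors the description in this message, not the kernel implementation.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the kernel's struct zone. */
struct toy_zone {
    unsigned long free_pages;
    unsigned long low_wmark;
    unsigned long high_wmark;
};

/* Reclaim criterion: no reclaim once free pages reach the high watermark. */
bool watermark_ok(const struct toy_zone *z)
{
    return z->free_pages >= z->high_wmark;
}

/* Compaction criterion: low watermark plus twice the allocation size. */
bool compaction_has_room(const struct toy_zone *z, unsigned int order)
{
    return z->free_pages >= z->low_wmark + (2UL << order);
}

/* Unified check: watermark first, then compaction for order > 0. */
bool zone_balanced(const struct toy_zone *z, unsigned int order)
{
    if (!watermark_ok(z))
        return false;
    if (order > 0 && !compaction_has_room(z, order))
        return false;
    return true;
}
```

For a small zone with free_pages = 100, low_wmark = 40 and high_wmark = 60, an order-6 request passes watermark_ok() but fails compaction_has_room() (it needs 40 + 128 pages). With two separate checks nothing is reclaimed and the loop never ends; with the unified check the zone counts as unbalanced, so reclaim runs.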
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Reported-by: George Spelvin <linux@horizon.com>
      Reported-by: Johannes Hirte <johannes.hirte@fem.tu-ilmenau.de>
      Reported-by: Tomas Racek <tracek@redhat.com>
      Tested-by: Johannes Hirte <johannes.hirte@fem.tu-ilmenau.de>
      Reviewed-by: Rik van Riel <riel@redhat.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/vmemmap: fix wrong use of virt_to_page · ae64ffca
      Jianguo Wu authored
      I enabled CONFIG_DEBUG_VIRTUAL and CONFIG_SPARSEMEM_VMEMMAP; when doing
      memory hotremove, there is a kernel BUG at arch/x86/mm/physaddr.c:20.
      
      It is caused by free_section_usemap()->virt_to_page().  virt_to_page() is
      only valid for kernel direct mapping addresses, but sparse-vmemmap uses
      vmemmap addresses, so it goes wrong here.
      
        ------------[ cut here ]------------
        kernel BUG at arch/x86/mm/physaddr.c:20!
        invalid opcode: 0000 [#1] SMP
        Modules linked in: acpihp_drv acpihp_slot edd cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi_cpufreq mperf fuse vfat fat loop dm_mod coretemp kvm crc32c_intel ipv6 ixgbe igb iTCO_wdt i7core_edac edac_core pcspkr iTCO_vendor_support ioatdma microcode joydev sr_mod i2c_i801 dca lpc_ich mfd_core mdio tpm_tis i2c_core hid_generic tpm cdrom sg tpm_bios rtc_cmos button ext3 jbd mbcache usbhid hid uhci_hcd ehci_hcd usbcore usb_common sd_mod crc_t10dif processor thermal_sys hwmon scsi_dh_alua scsi_dh_hp_sw scsi_dh_rdac scsi_dh_emc scsi_dh ata_generic ata_piix libata megaraid_sas scsi_mod
        CPU 39
        Pid: 6454, comm: sh Not tainted 3.7.0-rc1-acpihp-final+ #45 QCI QSSC-S4R/QSSC-S4R
        RIP: 0010:[<ffffffff8103c908>]  [<ffffffff8103c908>] __phys_addr+0x88/0x90
        RSP: 0018:ffff8804440d7c08  EFLAGS: 00010006
        RAX: 0000000000000006 RBX: ffffea0012000000 RCX: 000000000000002c
        ...
      Signed-off-by: Jianguo Wu <wujianguo@huawei.com>
      Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
      Reviewed-by: Wen Congyang <wency@cn.fujitsu.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Reviewed-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Reviewed-by: Michal Hocko <mhocko@suse.cz>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: compaction: fix return value of capture_free_page() · 58d00209
      Mel Gorman authored
      Commit ef6c5be6 ("fix incorrect NR_FREE_PAGES accounting (appears
      like memory leak)") fixes an NR_FREE_PAGES accounting leak but missed the
      return value, which was also missed by this reviewer until today.
      
      That return value is used by compaction when adding pages to a list of
      isolated free pages and without this follow-up fix, there is a risk of
      free list corruption.
      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Cc: Dave Hansen <dave@linux.vnet.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 29 Nov, 2012 2 commits
    • Merge branch 'v3.7-samsung-fixes-4' of... · 9434d24b
      Arnd Bergmann authored
      Merge branch 'v3.7-samsung-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung into fixes
      
      From Kukjin Kim <kgene.kim@samsung.com>:
      
      Samsung fixes for v3.7
      
      * 'v3.7-samsung-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung:
        ARM: S3C24XX: Fix potential NULL pointer dereference error
      
      This would have been ok to delay to 3.8 according to Kukjin, but since
      it's an obvious bug fix and a potential NULL pointer dereference, it
      seems appropriate for a late 3.7 submission.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · e9296e89
      Linus Torvalds authored
      Pull networking fixes from David Miller:
       "Some more fixes trickled in over the past few days:
      
         1) PIM device names can overflow the IFNAMSIZ buffer unless we
            properly limit the allowed indexes, fix from Eric Dumazet.
      
         2) Under heavy load we can OOPS in icmp reply processing due to an
            unchecked inet_putpeer() call.  Fix from Neal Cardwell.
      
         3) SCTP round trip calculations need to use 64-bit math to avoid
            overflows, fix from Schoch Christian.
      
         4) Fix a memory leak and an error return flub in SCTP and IRDA
            triggerable by userspace.  Fix from Tommi Rantala and found by the
            syscall fuzzer (trinity).
      
         5) MLX4 driver gives bogus size to memcpy() call, fix from Amir
            Vadai.
      
         6) Fix length calculation in VHOST descriptor translation, from
            Michael S Tsirkin.
      
         7) Ambassador ATM driver loops forever while loading firmware, fix
            from Dan Carpenter.
      
         8) Over MTU packets in openvswitch warn about wrong device, fix from
            Jesse Gross.
      
         9) Netfilter IPSET's netlink code can overrun a string buffer because
            it's not properly limited to IFNAMSIZ.  Fix from Florian Westphal.
      
        10) PCAN USB driver sets wrong timestamp in SKB, from Oliver Hartkopp.
      
        11) Make sure the RX ifindex always has a valid value in the CAN BCM
            driver, even if we haven't received a frame yet.  Fix also from
            Oliver Hartkopp."
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
        team: fix hw_features setup
        atm: forever loop loading ambassador firmware
        vhost: fix length for cross region descriptor
        irda: irttp: fix memory leak in irttp_open_tsap() error path
        net: qmi_wwan: add Huawei E173
        net/mlx4_en: Can set maxrate only for TC0
        sctp: Error in calculation of RTTvar
        sctp: fix -ENOMEM result with invalid user space pointer in sendto() syscall
        sctp: fix memory leak in sctp_datamsg_from_user() when copy from user space fails
        net: ipmr: limit MRT_TABLE identifiers
        ipv4: avoid passing NULL to inet_putpeer() in icmpv4_xrlim_allow()
        can: bcm: initialize ifindex for timeouts without previous frame reception
        can: peak_usb: fix hwtstamp assignment
        netfilter: ipset: fix netiface set name overflow
        openvswitch: Store flow key len if ARP opcode is not request or reply.
        openvswitch: Print device when warning about over MTU packets.
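Item 1 of the pull message (the PIM device name overflow) comes down to simple length arithmetic, sketched below. Two things here are assumptions rather than facts from this log: that the device name is built with the format "pimreg%u" from the multicast routing table identifier, and the helper name pimreg_name_fits(), which is hypothetical. IFNAMSIZ is 16 on Linux, including the terminating NUL.

```c
#include <stdio.h>

#define IFNAMSIZ 16 /* Linux interface name buffer size, including NUL */

/* Returns 1 if "pimreg<id>" plus its terminating NUL fits in IFNAMSIZ bytes. */
int pimreg_name_fits(unsigned int table_id)
{
    /* With a zero size, snprintf only reports the length it would write. */
    int len = snprintf(NULL, 0, "pimreg%u", table_id);

    return len >= 0 && len + 1 <= IFNAMSIZ;
}
```

Under these assumptions, "pimreg" takes 6 characters, so identifiers of up to 9 decimal digits fit (15 characters plus NUL), while a 10-digit identifier such as 1000000000 needs 17 bytes and would overflow the buffer.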
  3. 28 Nov, 2012 13 commits
  4. 27 Nov, 2012 14 commits
    • Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media · e23739b4
      Linus Torvalds authored
      Pull media fixes from Mauro Carvalho Chehab:
       "For some media fixes:
         - dvb_usb_v2: some fixes at the core
         - Some fixes on some embedded drivers: soc_camera, adv7604, omap3isp,
           exynos/s5p
         - Several Exynos4/5 camera fixes
         - a fix at stv0900 driver
         - a few USB ID additions to detect more variants of rtl28xxu-based
           sticks"
      
      * 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (25 commits)
        [media] rtl28xxu: 0ccd:00d7 TerraTec Cinergy T Stick+
        [media] rtl28xxu: 1d19:1102 Dexatek DK mini DVB-T Dongle
        [media] mt9v022: fix the V4L2_CID_EXPOSURE control
        [media] mx2_camera: fix missing unlock on error in mx2_start_streaming()
        [media] media: omap1_camera: fix const cropping related warnings
        [media] media: mx1_camera: use the default .set_crop() implementation
        [media] media: mx2_camera: fix const cropping related warnings
        [media] media: mx3_camera: fix const cropping related warnings
        [media] media: pxa_camera: fix const cropping related warnings
        [media] media: sh_mobile_ceu_camera: fix const cropping related warnings
        [media] media: sh_vou: fix const cropping related warnings
        [media] adv7604: restart STDI once if format is not found
        [media] adv7604: use presets where possible
        [media] adv7604: Replace prim_mode by mode
        [media] adv7604: cleanup references
        [media] dvb_usb_v2: switch interruptible mutex to normal
        [media] dvb_usb_v2: fix pid_filter callback error logging
        [media] exynos-gsc: change driver compatible string
        [media] omap3isp: Fix warning caused by bad subdev events operations prototypes
        [media] omap3isp: video: Fix warning caused by bad vidioc_s_crop prototype
        ...
    • Merge branch 'akpm' (Fixes from Andrew) · 2844a487
      Linus Torvalds authored
      Merge misc fixes from Andrew Morton:
       "8 fixes"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (8 patches)
        futex: avoid wake_futex() for a PI futex_q
        watchdog: using u64 in get_sample_period()
        writeback: put unused inodes to LRU after writeback completion
        mm: vmscan: check for fatal signals iff the process was throttled
        Revert "mm: remove __GFP_NO_KSWAPD"
        proc: check vma->vm_file before dereferencing
        UAPI: strip the _UAPI prefix from header guards during header installation
        include/linux/bug.h: fix sparse warning related to BUILD_BUG_ON_INVALID
    • Merge tag 'tty-3.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty · 5687100a
      Linus Torvalds authored
      Pull TTY fix from Greg Kroah-Hartman:
       "Here is a single fix for a reported regression in 3.7-rc5 for the tty
        layer.  This fix has been in the linux-next tree and solves the
        reported problem.
      
        Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"
      
      * tag 'tty-3.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
        tty vt: Fix a regression in command line edition
    • Merge tag 'mfd-for-linus-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6 · c854539d
      Linus Torvalds authored
      Pull MFD fixes from Samuel Ortiz:
      
       - A twl fix preventing a buffer overflow.
      
       - A wm5102 register patch fix.
      
       - A wm5110 error misreport fix.
      
       - Arizona fixes: Use the right array size when adding subdevices,
         correctly report underclocked events, synchronize register cache
         after reset.
      
       - A twl4030 fix that prevents the system from hanging due to an
         interrupt flood.
      
      * tag 'mfd-for-linus-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6:
        mfd: twl4030: Fix chained irq handling on resume from suspend
        mfd: arizona: Sync regcache after reset
        mfd: arizona: Correctly report when AIF2/AIF1 is underclocked
        mfd: arizona: Use correct array for ARRAY_SIZE in mfd_add_devices call
        mfd: wm5110: Disable control interface error report for WM5110 rev B
        mfd: wm5102: Update register patch for latest evaluation
        mfd: twl-core: Fix chip ID for the twl6030-pwm module
    • Merge branch 'fixes' of git://git.linaro.org/people/rmk/linux-arm · 33057692
      Linus Torvalds authored
      Pull ARM fixes from Russell King:
       "Not much here, just a couple minor/cosmetic fixes and a patch for the
        decompressor which fixes problems with modern GCC and CPUs."
      
      * 'fixes' of git://git.linaro.org/people/rmk/linux-arm:
        ARM: 7583/1: decompressor: Enable unaligned memory access for v6 and above
        ARM: 7572/1: proc-v6.S: fix comment
        ARM: 7570/1: quiet down the non make -s output
    • Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs · 87726c33
      Linus Torvalds authored
      Pull ext3 regression fix from Jan Kara:
       "Fix an ext3 regression introduced during 3.7 merge window.  It leads
        to deadlock if you stress the filesystem in the right way (luckily
        only if blocksize < pagesize)."
      
      * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
        jbd: Fix lock ordering bug in journal_unmap_buffer()
    • futex: avoid wake_futex() for a PI futex_q · aa10990e
      Darren Hart authored
      Dave Jones reported a bug with futex_lock_pi() that his trinity test
      exposed.  Sometime between queue_me() and taking the q.lock_ptr, the
      lock_ptr became NULL, resulting in a crash.
      
      While futex_wake() is careful to not call wake_futex() on futex_q's with
      a pi_state or an rt_waiter (which are either waiting for a
      futex_unlock_pi() or a PI futex_requeue()), futex_wake_op() and
      futex_requeue() do not perform the same test.
      
      Update futex_wake_op() and futex_requeue() to test for q.pi_state and
      q.rt_waiter and abort with -EINVAL if detected.  To ensure any future
      breakage is caught, add a WARN() to wake_futex() if the same condition
      is true.
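The added test can be sketched as a small user-space model. struct toy_futex_q and check_plain_wake() are hypothetical names used only for illustration; the real struct futex_q has more fields and the kernel code differs in detail.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, minimal stand-in for the kernel's struct futex_q. */
struct toy_futex_q {
    void *pi_state;  /* non-NULL: waiter belongs to a PI futex */
    void *rt_waiter; /* non-NULL: waiter is part of a PI requeue */
};

/* Return 0 if a plain wake is allowed, -EINVAL for a PI waiter. */
int check_plain_wake(const struct toy_futex_q *q)
{
    if (q->pi_state != NULL || q->rt_waiter != NULL)
        return -EINVAL;
    return 0;
}
```

In this model a caller such as futex_wake_op() or futex_requeue() would run the check before waking a waiter and bail out on a non-zero result, so a PI waiter is never handed to wake_futex().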
      
      This fix has seen 3 hours of testing with "trinity -c futex" on an
      x86_64 VM with 4 CPUS.
      
      [akpm@linux-foundation.org: tidy up the WARN()]
      Signed-off-by: Darren Hart <dvhart@linux.intel.com>
      Reported-by: Dave Jones <davej@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: John Kacur <jkacur@redhat.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • watchdog: using u64 in get_sample_period() · 8ffeb9b0
      Chuansheng Liu authored
      In get_sample_period(), unsigned long is not enough:
      
        watchdog_thresh * 2 * (NSEC_PER_SEC / 5)
      
      case1:
        watchdog_thresh is 10 by default, the sample value will be: 0xEE6B2800
      
      case2:
        if watchdog_thresh is set to 20, the sample value will be: 0x1 DCD6 5000
      
      In case2, we need to use u64 to express the sample period.  Otherwise,
      changing the threshold through proc often fails.
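The overflow can be reproduced in user space with fixed-width integers. This is a sketch, not the kernel function: uint32_t stands in for unsigned long on a 32-bit build, and sample_period_u32(), sample_period_u64() and check_commit_message_cases() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* The formula evaluated in 32 bits, as on a 32-bit unsigned long. */
uint32_t sample_period_u32(uint32_t watchdog_thresh)
{
    return watchdog_thresh * 2u * (uint32_t)(NSEC_PER_SEC / 5);
}

/* The same formula with the first operand widened to 64 bits. */
uint64_t sample_period_u64(uint32_t watchdog_thresh)
{
    return (uint64_t)watchdog_thresh * 2 * (NSEC_PER_SEC / 5);
}

/* Replays the two cases from the commit message. */
void check_commit_message_cases(void)
{
    /* case1: 4,000,000,000 ns still fits in 32 bits. */
    assert(sample_period_u64(10) == 0xEE6B2800ULL);
    assert(sample_period_u32(10) == 0xEE6B2800UL);

    /* case2: 8,000,000,000 ns needs 33 bits, so the top bit is lost. */
    assert(sample_period_u64(20) == 0x1DCD65000ULL);
    assert(sample_period_u32(20) == 0xDCD65000UL);
}
```

With a threshold of 20 the 32-bit result is about 3.7 seconds instead of the intended 8 seconds, which is why the wider type is needed.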
      Signed-off-by: liu chuansheng <chuansheng.liu@intel.com>
      Acked-by: Don Zickus <dzickus@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • writeback: put unused inodes to LRU after writeback completion · 4eff96dd
      Jan Kara authored
      Commit 169ebd90 ("writeback: Avoid iput() from flusher thread")
      removed iget-iput pair from inode writeback.  As a side effect, inodes
      that are dirty during the iput_final() call will never be added to the
      inode LRU (iput_final() doesn't add dirty inodes to the LRU, and later
      when the inode is cleaned there is no one to add it there).  Thus inodes are
      effectively unreclaimable until someone looks them up again.
      
      The practical effect of this bug is limited by the fact that inodes are
      pinned by a dentry for long enough that the inode gets cleaned.  But
      still the bug can have nasty consequences leading up to OOM conditions
      under certain circumstances.  The following can easily reproduce the
      problem:
      
        for (( i = 0; i < 1000; i++ )); do
          mkdir $i
          for (( j = 0; j < 1000; j++ )); do
            touch $i/$j
            echo 2 > /proc/sys/vm/drop_caches
          done
        done
      
      then one needs to run 'sync; ls -lR' to make inodes reclaimable again.
      
      We fix the issue by inserting unused clean inodes into the LRU after
      writeback finishes in inode_sync_complete().
      Signed-off-by: Jan Kara <jack@suse.cz>
      Reported-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: <stable@vger.kernel.org>  [3.5+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: vmscan: check for fatal signals iff the process was throttled · 50694c28
      Mel Gorman authored
      Commit 5515061d ("mm: throttle direct reclaimers if PF_MEMALLOC
      reserves are low and swap is backed by network storage") introduced a
      check for fatal signals after a process gets throttled for network
      storage.  The intention was that if a process was throttled and got
      killed that it should not trigger the OOM killer.  As pointed out by
      Minchan Kim and David Rientjes, this check is in the wrong place and too
      broad.  If a system is in an OOM situation and a process is exiting, it
      can loop in __alloc_pages_slowpath(), calling direct reclaim in a
      loop.  As the fatal signal is pending it returns 1 as if it is making
      forward progress and can effectively deadlock.
      
      This patch moves the fatal_signal_pending() check after throttling to
      throttle_direct_reclaim() where it belongs.  If the process is killed
      while throttled, it will return immediately without direct reclaim
      except now it will have TIF_MEMDIE set and will use the PFMEMALLOC
      reserves.
      
      Minchan pointed out that it may be better to direct reclaim before
      returning to avoid using the reserves because there may be pages that
      can easily be reclaimed that would avoid using the reserves.  However,
      we do no such targeted reclaim and there is no guarantee that suitable
      pages are available.  As it is expected that this throttling happens when
      swap-over-NFS is used there is a possibility that the process will
      instead swap which may allocate network buffers from the PFMEMALLOC
      reserves.  Hence, in the swap-over-NFS case where a process can be
      throttled and be killed it can use the reserves to exit or it can
      potentially use reserves to swap a few pages and then exit.  This patch
      takes the option of using the reserves if necessary to allow the process
      to exit quickly.
      
      If this patch passes review it should be considered a -stable candidate
      for 3.6.
      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Luigi Semenzato <semenzato@google.com>
      Cc: Dan Magenheimer <dan.magenheimer@oracle.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Sonny Rao <sonnyrao@google.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Revert "mm: remove __GFP_NO_KSWAPD" · 82b212f4
      Mel Gorman authored
      With "mm: vmscan: scale number of pages reclaimed by reclaim/compaction
      based on failures" reverted, Zdenek Kabelac reported the following
      
        Hmm,  so it's just took longer to hit the problem and observe
        kswapd0 spinning on my CPU again - it's not as endless like before -
        but still it easily eats minutes - it helps to turn off  Firefox
        or TB  (memory hungry apps) so kswapd0 stops soon - and restart
        those apps again.  (And I still have like >1GB of cached memory)
      
        kswapd0         R  running task        0    30      2 0x00000000
        Call Trace:
          preempt_schedule+0x42/0x60
          _raw_spin_unlock+0x55/0x60
          put_super+0x31/0x40
          drop_super+0x22/0x30
          prune_super+0x149/0x1b0
          shrink_slab+0xba/0x510
      
      The sysrq+m indicates the system has no swap so it'll never reclaim
      anonymous pages as part of reclaim/compaction.  That is one part of the
      problem but not the root cause as file-backed pages could also be
      reclaimed.
      
      The likely underlying problem is that kswapd is woken up or kept awake
      for each THP allocation request in the page allocator slow path.
      
      If compaction fails for the requesting process then compaction will be
      deferred for a time and direct reclaim is avoided.  However, if there
      is a storm of THP requests that are simply rejected, it will still be
      the case that kswapd is awake for a prolonged period of time as
      pgdat->kswapd_max_order is updated each time.  This is noticed by the
      main kswapd() loop and it will not call kswapd_try_to_sleep().  Instead
      it will loop, shrinking a small number of pages and calling
      shrink_slab() on each iteration.
      
      The temptation is to supply a patch that checks if kswapd was woken for
      THP and, if so, ignores pgdat->kswapd_max_order, but it would be a hack
      and not backed up by proper testing.  As 3.7 is very close to release
      and this is not a bug we should release with, a safer path is to revert
      "mm: remove __GFP_NO_KSWAPD" for now and revisit it with a view to
      ironing out the balance_pgdat() logic in general.
      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Cc: Zdenek Kabelac <zkabelac@redhat.com>
      Cc: Seth Jennings <sjenning@linux.vnet.ibm.com>
      Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
      Cc: Jiri Slaby <jirislaby@gmail.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Robert Jennings <rcj@linux.vnet.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      82b212f4
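
The loop behaviour described in the commit above (82b212f4) can be illustrated with a toy userspace model. This is only a sketch under stated assumptions: `toy_pgdat`, `toy_wakeup_kswapd()` and `toy_kswapd_iterations()` are hypothetical names, order 9 is used merely as an example THP order, and the real kswapd() and balance_pgdat() logic is considerably more involved. The model only shows why a steady stream of wakeups that keep raising kswapd_max_order prevents the loop from reaching its sleep path.

```c
#include <stddef.h>

/* Hypothetical stand-in for the per-node state that kswapd consults. */
struct toy_pgdat {
    int kswapd_max_order;   /* highest order requested since the last check */
};

/* Models the allocator slow path waking kswapd for a given order. */
static void toy_wakeup_kswapd(struct toy_pgdat *pgdat, int order)
{
    if (order > pgdat->kswapd_max_order)
        pgdat->kswapd_max_order = order;
}

/*
 * Runs the toy loop while `pending` wakeups of order `order` arrive, one per
 * iteration.  Returns how many iterations were spent working before the
 * loop was finally allowed to sleep.
 */
static int toy_kswapd_iterations(struct toy_pgdat *pgdat, int pending, int order)
{
    int iterations = 0;

    for (;;) {
        if (pending > 0) {
            toy_wakeup_kswapd(pgdat, order);
            pending--;
        }
        if (pgdat->kswapd_max_order == 0)
            break;                       /* no new request: would try to sleep */
        pgdat->kswapd_max_order = 0;     /* consume the request */
        iterations++;                    /* stands in for reclaim + shrink_slab() */
    }
    return iterations;
}
```

With no pending requests the model returns immediately, while a burst of N rejected requests costs N iterations of reclaim work even though none of the allocations benefits from it.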
    • Stanislav Kinsbursky's avatar
      proc: check vma->vm_file before dereferencing · 05f56484
      Stanislav Kinsbursky authored
      Commit 7b540d06 ("proc_map_files_readdir(): don't bother with
      grabbing files") switched proc_map_files_readdir() to use @f_mode
      directly instead of grabbing a @file reference, but at the same time
      the test for @vm_file presence was lost, leading to a NULL
      dereference.  The patch brings the test back.
      
      The whole proc_map_files feature is wrapped in
      CONFIG_CHECKPOINT_RESTORE (which is set to 'n' by default), so the bug
      doesn't affect regular kernels.
      
      The regression is 3.7-rc1 only as far as I can tell.
      
      [gorcunov@openvz.org: provided changelog]
      Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
      Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      05f56484
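
The kind of check restored by the commit above (05f56484) can be sketched in plain C. The structures and the function `toy_collect_f_modes()` below are hypothetical, pared-down stand-ins rather than the kernel's own definitions; the sketch assumes only what the changelog states, namely that a mapping may have no backing file and that its f_mode must not be read in that case.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct toy_file {
    unsigned int f_mode;
};

struct toy_vma {
    struct toy_file *vm_file;   /* NULL for mappings without a backing file */
    struct toy_vma *vm_next;
};

/*
 * Walks the list and stores the f_mode of every file-backed mapping in
 * `modes` (at most `max` entries).  Returns the number of entries stored.
 */
static size_t toy_collect_f_modes(const struct toy_vma *vma,
                                  unsigned int *modes, size_t max)
{
    size_t n = 0;

    for (; vma != NULL && n < max; vma = vma->vm_next) {
        if (!vma->vm_file)      /* the test that must come before the dereference */
            continue;
        modes[n++] = vma->vm_file->f_mode;
    }
    return n;
}
```

Without the `vm_file` test, the first mapping with no backing file would cause a NULL pointer dereference when its f_mode is read.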
    • David Howells's avatar
      UAPI: strip the _UAPI prefix from header guards during header installation · 56c176c9
      David Howells authored
      Strip the _UAPI prefix from header guards during header installation so
      that any userspace dependencies aren't affected.  glibc, for example,
      checks for linux/types.h, linux/kernel.h, linux/compiler.h and
      linux/list.h by their guards - though the last two aren't actually
      exported.
      
        libtool: compile:  gcc -std=gnu99 -DHAVE_CONFIG_H -I. -Wall -Werror -Wformat -Wformat-security -D_FORTIFY_SOURCE=2 -fno-delete-null-pointer-checks -fstack-protector -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i686 -mtune=atom -fasynchronous-unwind-tables -c child.c  -fPIC -DPIC -o .libs/child.o
        In file included from cli.c:20:0:
        common.h:152:8: error: redefinition of 'struct sysinfo'
        In file included from /usr/include/linux/kernel.h:4:0,
        		 from /usr/include/linux/sysctl.h:25,
        		 from /usr/include/sys/sysctl.h:43,
        		 from common.h:50,
        		 from cli.c:20:
        /usr/include/linux/sysinfo.h:7:8: note: originally defined here
      Reported-by: Tomasz Torcz <tomek@pipebreaker.pl>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Josh Boyer <jwboyer@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      56c176c9
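
The transformation described in the commit above (56c176c9) amounts to rewriting the guard macro name as headers are installed. The changelog does not show how the installation step does this, so the following is only a hypothetical illustration in C: `toy_strip_uapi_guard()` is an invented helper, it handles just `#ifndef` and `#define` lines, and it is not the kernel's actual installation tooling.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: copies `line` into `out`, dropping a leading "_UAPI"
 * from the macro name when the line is an #ifndef or #define directive.
 * Any other line is copied unchanged.
 */
static void toy_strip_uapi_guard(const char *line, char *out, size_t outsz)
{
    static const char *const directives[] = { "#ifndef ", "#define " };
    static const char prefix[] = "_UAPI";
    const size_t plen = sizeof(prefix) - 1;
    size_t i;

    for (i = 0; i < sizeof(directives) / sizeof(directives[0]); i++) {
        size_t dlen = strlen(directives[i]);

        if (strncmp(line, directives[i], dlen) == 0 &&
            strncmp(line + dlen, prefix, plen) == 0) {
            /* Re-emit the directive followed by the name minus its prefix. */
            snprintf(out, outsz, "%s%s", directives[i], line + dlen + plen);
            return;
        }
    }
    snprintf(out, outsz, "%s", line);
}
```

For example, `#ifndef _UAPI_LINUX_KERNEL_H` becomes `#ifndef _LINUX_KERNEL_H`, which is the guard name that userspace code testing for linux/kernel.h would expect to find.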
    • Tushar Behera's avatar
      include/linux/bug.h: fix sparse warning related to BUILD_BUG_ON_INVALID · c5782e9f
      Tushar Behera authored
      Commit baf05aa9 ("bug: introduce BUILD_BUG_ON_INVALID() macro")
      introduces this macro only when __CHECKER__ is not defined.  Define a
      silent macro in the else condition to fix the following sparse warnings:
      
        mm/filemap.c:395:9: error: undefined identifier 'BUILD_BUG_ON_INVALID'
        mm/filemap.c:396:9: error: undefined identifier 'BUILD_BUG_ON_INVALID'
        mm/filemap.c:397:9: error: undefined identifier 'BUILD_BUG_ON_INVALID'
        include/linux/mm.h:419:9: error: undefined identifier 'BUILD_BUG_ON_INVALID'
        include/linux/mm.h:419:9: error: not a function <noident>
      Signed-off-by: Tushar Behera <tushar.behera@linaro.org>
      Acked-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c5782e9f
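
The shape of the fix in the commit above (c5782e9f) can be sketched as a pair of definitions selected by `__CHECKER__`, the macro that sparse defines when it runs. The macro bodies below are an approximation written for illustration and should not be read as the exact contents of include/linux/bug.h; the empty `__force` definition and the function `toy_checked_increment()` are assumptions added so the fragment compiles outside the kernel.

```c
#ifndef __force
#define __force   /* sparse-only annotation; assumed empty for an ordinary compiler */
#endif

#ifdef __CHECKER__
/* Silent definition so that sparse still finds the identifier. */
#define BUILD_BUG_ON_INVALID(e) (0)
#else
/* The expression is only type-checked inside sizeof and is never evaluated. */
#define BUILD_BUG_ON_INVALID(e) ((void)(sizeof((__force long)(e))))
#endif

/* Hypothetical user of the macro: the side effect in the argument never runs. */
static int toy_checked_increment(int counter)
{
    BUILD_BUG_ON_INVALID(counter++);
    return counter + 1;
}
```

Because the argument is never evaluated in either branch, `toy_checked_increment(41)` yields 42 rather than 43, and a checker build no longer reports an undefined identifier at the call site.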