1. 12 Dec, 2012 21 commits
    • mm/memory_hotplug.c: update start_pfn in zone and pg_data when spanned_pages == 0. · 712cd386
      Tang Chen authored
      If we hot-remove memory only and leave the cpus alive, the corresponding
      node will not be removed.  But the node_start_pfn and node_spanned_pages
      in pg_data will be reset to 0.  In this case, when we hot-add the memory
      back next time, the node_start_pfn will always be 0 because no pfn is less
      than 0.  After that, if we hot-remove the memory again, it will cause
      kernel panic in function find_biggest_section_pfn() when it tries to scan
      all the pfns.
      
      The zone will also have the same problem.
      
      This patch sets start_pfn to the start_pfn of the section being added when
      spanned_pages of the zone or pg_data is 0.
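The rule described above can be sketched in user space as follows. This is a minimal model under stated assumptions, not the kernel code: `struct span` and `grow_span()` are hypothetical stand-ins for the start/spanned fields of a zone or pg_data and for the helper that grows them when a section is added.

```c
#include <assert.h>

/* Hypothetical stand-in for the start/span fields of a zone or pg_data. */
struct span {
    unsigned long start_pfn;
    unsigned long spanned_pages;
};

/* Grow the span so that it also covers [start_pfn, end_pfn). */
static void grow_span(struct span *s, unsigned long start_pfn,
                      unsigned long end_pfn)
{
    unsigned long old_end = s->start_pfn + s->spanned_pages;

    assert(start_pfn < end_pfn);

    /*
     * An empty span has start_pfn reset to 0. No pfn is less than 0, so
     * a plain "start_pfn < s->start_pfn" test would never move the start.
     * Take the new section's start unconditionally when the span is empty.
     */
    if (s->spanned_pages == 0 || start_pfn < s->start_pfn)
        s->start_pfn = start_pfn;

    s->spanned_pages = (old_end > end_pfn ? old_end : end_pfn) - s->start_pfn;
}
```

With an empty span, adding a section starting at pfn 0x80000 moves `start_pfn` to 0x80000 instead of leaving it at the stale 0.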
      
        ---How to reproduce---
      
      1. hot-add a container with some memory and cpus;
      2. hot-remove the container's memory, and leave cpus there;
      3. hot-add the memory again;
      4. hot-remove them again;
      
      then, the kernel will panic.
      
        ---Call trace---
      
        BUG: unable to handle kernel paging request at 00000fff82a8cc38
        IP: [<ffffffff811c0d55>] find_biggest_section_pfn+0xe5/0x180
        ......
        Call Trace:
         [<ffffffff811c1124>] __remove_zone+0x184/0x1b0
         [<ffffffff811c11dc>] __remove_section+0x8c/0xb0
         [<ffffffff811c12e7>] __remove_pages+0xe7/0x120
         [<ffffffff81654f7c>] arch_remove_memory+0x2c/0x80
         [<ffffffff81655bb6>] remove_memory+0x56/0x90
         [<ffffffff813da0c8>] acpi_memory_device_remove_memory+0x48/0x73
         [<ffffffff813da55a>] acpi_memory_device_notify+0x153/0x274
         [<ffffffff813b6786>] acpi_ev_notify_dispatch+0x41/0x5f
         [<ffffffff813a3867>] acpi_os_execute_deferred+0x27/0x34
         [<ffffffff81090589>] process_one_work+0x219/0x680
         [<ffffffff810923be>] worker_thread+0x12e/0x320
         [<ffffffff81098396>] kthread+0xc6/0xd0
         [<ffffffff8167c7c4>] kernel_thread_helper+0x4/0x10
        ......
        ---[ end trace 96d845dbf33fee11 ]---
      Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Wen Congyang <wency@cn.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • slub, hotplug: ignore unrelated node's hot-adding and hot-removing · b9d5ab25
      Lai Jiangshan authored
      SLUB only focuses on the nodes which have normal memory and it ignores the
      other node's hot-adding and hot-removing.
      
      That is: if some memory is onlined on a node which has no onlined memory,
      but this newly onlined memory is not normal memory (for example, highmem),
      we should not allocate kmem_cache_node for SLUB.
      
      And if the last normal memory is offlined, but the node still has memory,
      we should remove kmem_cache_node for that node.  (The current code delays
      this until all of the memory is offlined.)
      
      So we only do something when marg->status_change_nid_normal > 0.
      marg->status_change_nid is not suitable here.
      
      The same problem doesn't exist in SLAB, because SLAB allocates kmem_list3
      for every node even if the node doesn't have normal memory; SLAB tolerates
      kmem_list3 on alien nodes.  SLUB only focuses on the nodes which have
      normal memory and doesn't tolerate an alien kmem_cache_node.  The patch
      makes SLUB self-compatible and avoids WARNs and BUGs in rare conditions.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Minchan Kim <minchan.kim@gmail.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Rob Landley <rob@landley.net>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Jiang Liu <jiang.liu@huawei.com>
      Cc: Kay Sievers <kay.sievers@vrfy.org>
      Cc: Greg Kroah-Hartman <gregkh@suse.de>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Wen Congyang <wency@cn.fujitsu.com>
      Acked-by: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • memory_hotplug: fix possible incorrect node_states[N_NORMAL_MEMORY] · d9713679
      Lai Jiangshan authored
      Currently memory_hotplug only manages the node_states[N_HIGH_MEMORY], it
      forgets to manage node_states[N_NORMAL_MEMORY].  This may cause
      node_states[N_NORMAL_MEMORY] to become incorrect.
      
      For example, if a node is empty before onlining and we online memory that
      is in ZONE_NORMAL, then after onlining node_states[N_HIGH_MEMORY] is
      correct but node_states[N_NORMAL_MEMORY] is not: the online code doesn't
      set the newly onlined node in node_states[N_NORMAL_MEMORY].
      
      The same thing happens when offlining (the offline code doesn't clear
      the node from node_states[N_NORMAL_MEMORY] when needed).  Some memory
      management code depends on node_states[N_NORMAL_MEMORY], so we have to
      fix up node_states[N_NORMAL_MEMORY].
      
      We add node_states_check_changes_online() and
      node_states_check_changes_offline() to detect whether
      node_states[N_HIGH_MEMORY] and node_states[N_NORMAL_MEMORY] are changed
      while hotplugging.
      
      Also add @status_change_nid_normal to struct memory_notify, so that the
      memory hotplug callbacks know whether node_states[N_NORMAL_MEMORY] has
      changed.  (We could add a @flags field and reuse @status_change_nid
      instead of introducing @status_change_nid_normal, but that would add much
      more complexity to the memory hotplug callback in every subsystem.  So
      introducing @status_change_nid_normal is better, and it doesn't change the
      semantics of @status_change_nid.)
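A minimal user-space sketch of the online-side check, assuming a much simplified node model. `struct node_mem`, `struct notify_sketch`, `check_changes_online()` and the `NO_CHANGE` value of -1 are hypothetical names and conventions chosen for illustration; only the two field names come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>

#define NO_CHANGE (-1)  /* assumption: -1 means "this state did not change" */

/* Hypothetical, simplified view of the memory a node currently has online. */
struct node_mem {
    int nid;
    unsigned long normal_pages;  /* pages in ZONE_NORMAL and below */
    unsigned long high_pages;    /* pages above ZONE_NORMAL */
};

/* Simplified model of the two fields discussed in the commit message. */
struct notify_sketch {
    int status_change_nid;         /* node gains its first memory of any kind */
    int status_change_nid_normal;  /* node gains its first normal memory */
};

/* Decide which node states would change if nr_pages were onlined. */
static struct notify_sketch check_changes_online(const struct node_mem *n,
                                                 unsigned long nr_pages,
                                                 bool to_normal_zone)
{
    struct notify_sketch arg = { NO_CHANGE, NO_CHANGE };

    assert(nr_pages > 0);

    if (to_normal_zone && n->normal_pages == 0)
        arg.status_change_nid_normal = n->nid;
    if (n->normal_pages == 0 && n->high_pages == 0)
        arg.status_change_nid = n->nid;

    return arg;
}
```

Keeping two separate fields means a callback that only cares about normal memory (such as the SLUB case in the commit above) can test one integer and ignore the other.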
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Minchan Kim <minchan.kim@gmail.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Rob Landley <rob@landley.net>
      Cc: Jiang Liu <jiang.liu@huawei.com>
      Cc: Kay Sievers <kay.sievers@vrfy.org>
      Cc: Greg Kroah-Hartman <gregkh@suse.de>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Wen Congyang <wency@cn.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • memory-hotplug: allocate zone's pcp before onlining pages · 6dcd73d7
      Wen Congyang authored
      We use __free_page() to put a page to buddy system when onlining pages.
      __free_page() will store NR_FREE_PAGES in zone's pcp.vm_stat_diff, so we
      should allocate zone's pcp before onlining pages, otherwise we will lose
      some free pages.
      
      [mhocko@suse.cz: make zone_pcp_reset independent of MEMORY_HOTREMOVE]
      Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Jiang Liu <liuj97@gmail.com>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Minchan Kim <minchan.kim@gmail.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Dave Hansen <dave@linux.vnet.ibm.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Signed-off-by: Michal Hocko <mhocko@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • memory-hotplug, mm/sparse.c: clear the memory to store struct page · 3ac19f8e
      Wen Congyang authored
      If sparse memory vmemmap is enabled, we can't free the memory that stores
      struct page when a memory device is hot-removed, because that memory may
      also hold struct pages managing memory which doesn't belong to this
      memory device.  When we hot-add this memory device again, we reuse this
      memory to store struct page, and the struct pages may contain obsolete
      information, so we get a bad-page state:
      
        init_memory_mapping: [mem 0x80000000-0x9fffffff]
        Built 2 zonelists in Node order, mobility grouping on.  Total pages: 547617
        Policy zone: Normal
        BUG: Bad page state in process bash  pfn:9b6dc
        page:ffffea0002200020 count:0 mapcount:0 mapping:          (null) index:0xfdfdfdfdfdfdfdfd
        page flags: 0x2fdfdfdfd5df9fd(locked|referenced|uptodate|dirty|lru|active|slab|owner_priv_1|private|private_2|writeback|head|tail|swapcache|reclaim|swapbacked|unevictable|uncached|compound_lock)
        Modules linked in: netconsole acpiphp pci_hotplug acpi_memhotplug loop kvm_amd kvm microcode tpm_tis tpm tpm_bios evdev psmouse serio_raw i2c_piix4 i2c_core parport_pc parport processor button thermal_sys ext3 jbd mbcache sg sr_mod cdrom ata_generic virtio_net ata_piix virtio_blk libata virtio_pci virtio_ring virtio scsi_mod
        Pid: 988, comm: bash Not tainted 3.6.0-rc7-guest #12
        Call Trace:
         [<ffffffff810e9b30>] ? bad_page+0xb0/0x100
         [<ffffffff810ea4c3>] ? free_pages_prepare+0xb3/0x100
         [<ffffffff810ea668>] ? free_hot_cold_page+0x48/0x1a0
         [<ffffffff8112cc08>] ? online_pages_range+0x68/0xa0
         [<ffffffff8112cba0>] ? __online_page_increment_counters+0x10/0x10
         [<ffffffff81045561>] ? walk_system_ram_range+0x101/0x110
         [<ffffffff814c4f95>] ? online_pages+0x1a5/0x2b0
         [<ffffffff8135663d>] ? __memory_block_change_state+0x20d/0x270
         [<ffffffff81356756>] ? store_mem_state+0xb6/0xf0
         [<ffffffff8119e482>] ? sysfs_write_file+0xd2/0x160
         [<ffffffff8113769a>] ? vfs_write+0xaa/0x160
         [<ffffffff81137977>] ? sys_write+0x47/0x90
         [<ffffffff814e2f25>] ? async_page_fault+0x25/0x30
         [<ffffffff814ea239>] ? system_call_fastpath+0x16/0x1b
        Disabling lock debugging due to kernel taint
      
      This patch clears the memory used to store struct page to avoid unexpected errors.
      Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Jiang Liu <liuj97@gmail.com>
      Cc: Minchan Kim <minchan.kim@gmail.com>
      Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Reported-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
      Cc: Dave Hansen <dave@linux.vnet.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • memory-hotplug: suppress "Device nodeX does not have a release() function" warning · 8c7b5b4e
      Yasuaki Ishimatsu authored
      When calling unregister_node(), the function shows the following message
      at device_release().
      
      "Device 'node2' does not have a release() function, it is broken and must
      be fixed."
      
      The reason is that the node's device struct does not have a release()
      function.
      
      So the patch registers node_device_release() as the device's release()
      function to suppress the warning message.  Additionally, the patch adds
      a memset() to register_node() to initialize the node struct, because the
      node struct is part of the node_devices[] array and cannot be freed by
      node_device_release().  Without it, a reused node struct would contain
      garbage.
      Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Jiang Liu <liuj97@gmail.com>
      Cc: Minchan Kim <minchan.kim@gmail.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Greg KH <greg@kroah.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • numa: convert static memory to dynamically allocated memory for per node device · 8732794b
      Wen Congyang authored
      We use a static array to store struct node.  In many cases, we don't have
      too many nodes, and some memory will be unused.  Convert it to per-device
      dynamically allocated memory.
      Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Jiang Liu <liuj97@gmail.com>
      Cc: Minchan Kim <minchan.kim@gmail.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • memory-hotplug: fix NR_FREE_PAGES mismatch · 97d0da22
      Wen Congyang authored
      NR_FREE_PAGES will be wrong after offlining pages.  We add/dec
      NR_FREE_PAGES like this now:
      
      1. move all pages in buddy system to MIGRATE_ISOLATE, and dec NR_FREE_PAGES
      
      2. don't add NR_FREE_PAGES when it is freed and the migratetype is
         MIGRATE_ISOLATE
      
      3. dec NR_FREE_PAGES when offlining isolated pages.
      
      4. add NR_FREE_PAGES when undoing isolate pages.
      
      When we come to step 3, all pages are in the MIGRATE_ISOLATE list, and
      NR_FREE_PAGES is right.  When we come to step 4, none of the pages are in
      the buddy system, so we don't change NR_FREE_PAGES in that step, but we
      do change NR_FREE_PAGES in step 3.  So NR_FREE_PAGES is wrong after
      offlining pages, and there is no need to change NR_FREE_PAGES in step 3.
      
      This patch also fixes a problem in step 2: if the migratetype is
      MIGRATE_ISOLATE, we should not add NR_FREE_PAGES when we remove pages from
      pcppages.
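The double decrement described above can be illustrated with a toy counter model. This is a sketch under stated assumptions: `struct zone_sketch` and both helpers are hypothetical, and the `dec_again` flag only exists to contrast the old behaviour (decrement in step 3) with the fixed one.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: the free-page counter plus the isolated page count. */
struct zone_sketch {
    long nr_free_pages;
    long nr_isolated;
};

/* Step 1: move free pages to the isolate list and stop counting them. */
static void isolate_free_pages(struct zone_sketch *z, long n)
{
    assert(n >= 0 && n <= z->nr_free_pages);
    z->nr_free_pages -= n;
    z->nr_isolated += n;
}

/*
 * Step 3: offline the isolated pages. With dec_again set, the counter is
 * decremented a second time, which models the mismatch being fixed.
 */
static void offline_isolated_pages(struct zone_sketch *z, bool dec_again)
{
    if (dec_again)
        z->nr_free_pages -= z->nr_isolated;
    z->nr_isolated = 0;
}
```

Starting from 100 free pages and isolating 10, the fixed path leaves the counter at 90 after offlining, while the old path would leave it at 80.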
      Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Jiang Liu <liuj97@gmail.com>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Minchan Kim <minchan.kim@gmail.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Dave Hansen <dave@linux.vnet.ibm.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Jianguo Wu <wujianguo106@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • memory-hotplug: auto offline page_cgroup when onlining memory block failed · 7c72eb32
      Wen Congyang authored
      When a memory block is onlined, we try to allocate memory on that node
      to store page_cgroup.  If onlining the memory block fails, we don't
      offline the page_cgroup, and we have no chance to offline it unless the
      memory block is later onlined successfully.  As a result, we can't
      hot-remove the memory device on that node, because some of its memory is
      used to store page_cgroup.  If onlining the memory block fails, there
      is no need to store page_cgroup for this memory.  So automatically
      offline page_cgroup when onlining a memory block fails.
      Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Jiang Liu <liuj97@gmail.com>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Minchan Kim <minchan.kim@gmail.com>
      Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Dave Hansen <dave@linux.vnet.ibm.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • memory-hotplug: update mce_bad_pages when removing the memory · 95a4774d
      Wen Congyang authored
      When we hot-remove a memory device, we free the memory used to store
      struct page.  If the page is a hwpoisoned page, we should decrease
      mce_bad_pages.
      
      [akpm@linux-foundation.org: cleanup ifdefs]
      Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Jiang Liu <liuj97@gmail.com>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Minchan Kim <minchan.kim@gmail.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Dave Hansen <dave@linux.vnet.ibm.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • memory-hotplug: skip HWPoisoned page when offlining pages · b023f468
      Wen Congyang authored
      hwpoisoned may be set when we offline a page by the sysfs interface
      /sys/devices/system/memory/soft_offline_page or
      /sys/devices/system/memory/hard_offline_page. If we don't clear
      this flag when onlining pages, this page can't be freed, and will
      not be in the free list. So we can't offline these pages again. So we
      should skip such pages when offlining pages.
      Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Jiang Liu <liuj97@gmail.com>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Minchan Kim <minchan.kim@gmail.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • memory hotplug: suppress "Device memoryX does not have a release() function" warning · fa7194eb
      Yasuaki Ishimatsu authored
      When calling remove_memory_block(), the function shows the following
      message at device_release().
      
      "Device 'memory528' does not have a release() function, it is broken and
      must be fixed."
      
      The reason is that memory_block's device struct does not have a release()
      function.
      
      So the patch registers memory_block_release() as the device's release()
      function to suppress the warning message.  Additionally, the patch
      moves kfree(mem) into the release function, since the release function is
      intended as the means to free a memory_block struct.
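The idea of freeing an object from its release callback, once the last reference is dropped, can be sketched in user space as below. This is not the driver-core API: `struct device_sketch`, `get_device_sketch()`, `put_device_sketch()` and the other names are hypothetical stand-ins, and `released_blocks` exists only so the effect can be observed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Recover the enclosing structure from a pointer to one of its members. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, minimal reference-counted device. */
struct device_sketch {
    int refcount;
    void (*release)(struct device_sketch *dev);
};

/* Hypothetical object that embeds the device, like a memory block. */
struct memory_block_sketch {
    int id;
    struct device_sketch dev;
};

static int released_blocks;  /* counts how many blocks were freed */

/* Release callback: the only place where the enclosing object is freed. */
static void memory_block_release_sketch(struct device_sketch *dev)
{
    struct memory_block_sketch *mem =
        container_of(dev, struct memory_block_sketch, dev);

    free(mem);
    released_blocks++;
}

static struct memory_block_sketch *memory_block_create(int id)
{
    struct memory_block_sketch *mem = calloc(1, sizeof(*mem));

    if (!mem)
        return NULL;
    mem->id = id;
    mem->dev.refcount = 1;
    mem->dev.release = memory_block_release_sketch;
    return mem;
}

static void get_device_sketch(struct device_sketch *dev)
{
    dev->refcount++;
}

/* Drop one reference; returns 1 if this was the last one. */
static int put_device_sketch(struct device_sketch *dev)
{
    assert(dev->refcount > 0);
    if (--dev->refcount > 0)
        return 0;
    if (dev->release)
        dev->release(dev);
    return 1;
}
```

Because the free happens inside the callback, the memory stays valid for any other holder of a reference until the count reaches zero.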
      Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Acked-by: David Rientjes <rientjes@google.com>
      Cc: Jiang Liu <liuj97@gmail.com>
      Cc: Minchan Kim <minchan.kim@gmail.com>
      Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Wen Congyang <wency@cn.fujitsu.com>
      Cc: Greg KH <greg@kroah.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • thp: cleanup: introduce mk_huge_pmd() · b3092b3b
      Bob Liu authored
      Introduce mk_huge_pmd() to simplify the code
      Signed-off-by: Bob Liu <lliubbo@gmail.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Minchan Kim <minchan.kim@gmail.com>
      Cc: Ni zhan Chen <nizhan.chen@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • thp: introduce hugepage_vma_check() · fa475e51
      Bob Liu authored
      Multiple places do the same check.
      Signed-off-by: Bob Liu <lliubbo@gmail.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Minchan Kim <minchan.kim@gmail.com>
      Cc: Ni zhan Chen <nizhan.chen@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: introduce mm_find_pmd() · 6219049a
      Bob Liu authored
      Several places need to find the pmd by (mm_struct, address), so introduce
      a function to simplify it.
      
      [akpm@linux-foundation.org: fix warning]
      Signed-off-by: Bob Liu <lliubbo@gmail.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Minchan Kim <minchan.kim@gmail.com>
      Cc: Ni zhan Chen <nizhan.chen@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • thp: clean up __collapse_huge_page_isolate · 344aa35c
      Bob Liu authored
      There are duplicated places using release_pte_pages().
      And release_all_pte_pages() can be removed.
      Signed-off-by: Bob Liu <lliubbo@gmail.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Minchan Kim <minchan.kim@gmail.com>
      Cc: Ni zhan Chen <nizhan.chen@gmail.com>
      Acked-by: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: use IS_ENABLED(CONFIG_COMPACTION) instead of COMPACTION_BUILD · d84da3f9
      Kirill A. Shutemov authored
      We don't need custom COMPACTION_BUILD anymore, since we have handy
      IS_ENABLED().
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: Minchan Kim <minchan@kernel.org>
      Acked-by: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: use IS_ENABLED(CONFIG_NUMA) instead of NUMA_BUILD · e5adfffc
      Kirill A. Shutemov authored
      We don't need custom NUMA_BUILD anymore, since we have handy
      IS_ENABLED().
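To show why a per-option constant such as NUMA_BUILD becomes redundant, here is a user-space sketch. The `IS_ENABLED()` below is a simplified stand-in written for this example (it ignores the "=m" module case and uses helper macro names of my own), and `nr_fallback_lists_sketch()` with its return values is purely illustrative.

```c
#include <assert.h>

/* Pretend the build configuration enabled NUMA but not compaction. */
#define CONFIG_NUMA 1

/* Old style: one hand-written 0/1 constant per option. */
#ifdef CONFIG_NUMA
#define NUMA_BUILD 1
#else
#define NUMA_BUILD 0
#endif

/*
 * Simplified stand-in for IS_ENABLED(): expands to 1 if the option is
 * defined to 1 and to 0 if it is undefined.
 */
#define ARG_PLACEHOLDER_1 0,
#define TAKE_SECOND_ARG(ignored, val, ...) val
#define IS_DEFINED_3(arg1_or_junk) TAKE_SECOND_ARG(arg1_or_junk 1, 0, 0)
#define IS_DEFINED_2(val) IS_DEFINED_3(ARG_PLACEHOLDER_##val)
#define IS_ENABLED(option) IS_DEFINED_2(option)

/* Both spellings agree, so the custom constant adds nothing. */
_Static_assert(IS_ENABLED(CONFIG_NUMA) == NUMA_BUILD, "NUMA flag mismatch");

/* Used as an ordinary C condition; the dead branch can be optimized out. */
static int nr_fallback_lists_sketch(void)
{
    if (IS_ENABLED(CONFIG_NUMA))
        return 2;
    return 1;
}
```

The generic macro works for any option name, including ones that are not defined at all, which is what removes the need for a dedicated `*_BUILD` constant per option.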
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Acked-by: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm, memcg: make mem_cgroup_out_of_memory() static · 19965460
      David Rientjes authored
      mem_cgroup_out_of_memory() is only referenced from within file scope, so
      it can be marked static.
      Signed-off-by: David Rientjes <rientjes@google.com>
      Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: show migration types in show_mem · 377e4f16
      Rabin Vincent authored
      This is useful to diagnose the reason for page allocation failure for
      cases where there appear to be several free pages.
      
      Example, with this alloc_pages(GFP_ATOMIC) failure:
      
       swapper/0: page allocation failure: order:0, mode:0x0
       ...
       Mem-info:
       Normal per-cpu:
       CPU    0: hi:   90, btch:  15 usd:  48
       CPU    1: hi:   90, btch:  15 usd:  21
       active_anon:0 inactive_anon:0 isolated_anon:0
        active_file:0 inactive_file:84 isolated_file:0
        unevictable:0 dirty:0 writeback:0 unstable:0
        free:4026 slab_reclaimable:75 slab_unreclaimable:484
        mapped:0 shmem:0 pagetables:0 bounce:0
       Normal free:16104kB min:2296kB low:2868kB high:3444kB active_anon:0kB
       inactive_anon:0kB active_file:0kB inactive_file:336kB unevictable:0kB
       isolated(anon):0kB isolated(file):0kB present:331776kB mlocked:0kB
       dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:300kB
       slab_unreclaimable:1936kB kernel_stack:328kB pagetables:0kB unstable:0kB
       bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
       lowmem_reserve[]: 0 0
      
      Before the patch, it's hard (for me, at least) to say why all these free
      chunks weren't considered for allocation:
      
       Normal: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 1*256kB 1*512kB
       1*1024kB 1*2048kB 3*4096kB = 16128kB
      
      After the patch, it's obvious that the reason is that all of these are
      in the MIGRATE_CMA (C) freelist:
      
       Normal: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 1*256kB (C) 1*512kB
       (C) 1*1024kB (C) 1*2048kB (C) 3*4096kB (C) = 16128kB
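The per-order annotation shown in the "after" output can be sketched as below. This is an illustrative formatter, not the show_mem code: `format_order()` and the `MT_*` names are hypothetical, a 4 kB page size is assumed to match the output above, and only the letter C for MIGRATE_CMA comes from the commit message (U, E and M are my own placeholders).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical subset of migration types for this sketch. */
enum { MT_UNMOVABLE, MT_RECLAIMABLE, MT_MOVABLE, MT_CMA, MT_NR };

/*
 * Format one free-list entry such as "1*256kB (C)". type_mask has one bit
 * per migration type whose free list is non-empty at this order.
 */
static void format_order(char *buf, size_t len, unsigned long nr,
                         unsigned int order, unsigned int type_mask)
{
    /* Only 'C' is taken from the commit message; the rest are assumed. */
    static const char letters[MT_NR] = { 'U', 'E', 'M', 'C' };
    char types[MT_NR + 1];
    int i, n = 0;

    for (i = 0; i < MT_NR; i++)
        if (type_mask & (1u << i))
            types[n++] = letters[i];
    types[n] = '\0';

    /* Orders with no free blocks get no annotation, as in the output. */
    if (nr > 0 && n > 0)
        snprintf(buf, len, "%lu*%lukB (%s)", nr, 4UL << order, types);
    else
        snprintf(buf, len, "%lu*%lukB", nr, 4UL << order);
}
```

For example, one free block of order 6 that sits only on the CMA list would be formatted as `1*256kB (C)`, and an order with no free blocks as `0*4kB`.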
      Signed-off-by: Rabin Vincent <rabin.vincent@stericsson.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • writeback: remove nr_pages_dirtied arg from balance_dirty_pages_ratelimited_nr() · d0e1d66b
      Namjae Jeon authored
      There is no reason to pass the nr_pages_dirtied argument, because
      nr_pages_dirtied value from the caller is unused in
      balance_dirty_pages_ratelimited_nr().
      Signed-off-by: Namjae Jeon <linkinjeon@gmail.com>
      Signed-off-by: Vivek Trivedi <vtrivedi018@gmail.com>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 11 Dec, 2012 16 commits
    • Merge tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux-2.6 · b58ed041
      Linus Torvalds authored
      Pull device tree changes from Grant Likely:
       "Here are the DT changes I've got queued up for v3.8.  As described
        below, there are a lot of bug fixes here and documentation updates but
        nothing major:
      
        Bug fixes, little cleanups, and documentation changes.  The most
        invasive thing here touches a bunch of the arch directories to use a
        common build rule for .dtb files.  There are no major changes to
        functionality here other than a few new helper functions."
      
      * tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux-2.6: (34 commits)
        arm64: Fix the dtbs target building
        mtd: nand: davinci: fix the binding documentation
        rtc: rtc-mv: Add the device tree binding documentation
        devicetree/bindings: Move gpio-leds binding into leds directory
        of/vendor-prefixes: add Imagination Technologies
        microblaze: use new common dtc rule
        c6x: use new common dtc rule
        openrisc: use new common dtc rule
        arm64: Add dtbs target for building all the enabled dtb files
        arm64: use new common dtc rule
        ARM: dt: change .dtb build rules to build in dts directory
        kbuild: centralize .dts->.dtb rule
        Fix build when CONFIG_W1_MASTER_GPIO=m b exporting "allnodes"
        of/spi: Honour "status=disabled" property of device
        of_mdio: Honour "status=disabled" property of device
        of_i2c: Honour "status=disabled" property of device
        powerpc: Fix fallout from device_node->name constification
        of: add 'const' for of_parse_phandle parameter *np
        Documentation: correct of_platform_populate() argument list
        script: dtc: clean generated files
        ...
    • Merge tag 'irqdomain-for-linus' of git://git.secretlab.ca/git/linux-2.6 · 259cdbee
      Linus Torvalds authored
      Pull irqdomain changes from Grant Likely:
       "Trivial changes to irqdomain.  An update to the documentation and make
        one of the error paths not quite so obnoxious."
      
      * tag 'irqdomain-for-linus' of git://git.secretlab.ca/git/linux-2.6:
        irqdomain: update documentation
        irqdomain: stop screaming about preallocated irqdescs
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp · 9ada9fd5
      Linus Torvalds authored
      Pull EDAC fixes from Borislav Petkov:
      
       - EDAC core error path fix, from Denis Kirjanov.
      
       - Generalization of AMD MCE bank names and some minor error reporting
         improvements.
      
       - EDAC core cleanups and simplifications, from Wei Yongjun.
      
       - amd64_edac fixes for sysfs-reported values, from Josh Hunt.
      
       - some heavy amd64_edac error reporting path shaving, leading to
         removing a bunch of code.
      
       - amd64_edac error injection method improvements.
      
       - EDAC core cleanups and fixes
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: (24 commits)
        EDAC, pci_sysfs: Use for_each_pci_dev to simplify the code
        EDAC: Handle error path in edac_mc_sysfs_init() properly
        MCE, AMD: Dump error status
        MCE, AMD: Report decoded error type first
        MCE, AMD: Dump CPU f/m/s triple with the error
        MCE, AMD: Remove functional unit references
        EDAC: Convert to use simple_open()
        EDAC, Calxeda highbank: Convert to use simple_open()
        EDAC: Fix mc size reported in sysfs
        EDAC: Fix csrow size reported in sysfs
        EDAC: Pass mci parent
        EDAC: Add memory controller flags
        amd64_edac: Fix csrows size and pages computation
        amd64_edac: Use DBAM_DIMM macro
        amd64_edac: Fix K8 chip select reporting
        amd64_edac: Reorganize error reporting path
        amd64_edac: Do not check whether error address is valid
        amd64_edac: Improve error injection
        amd64_edac: Cleanup error injection code
        amd64_edac: Small fixlets and cleanups
        ...
      9ada9fd5
    • Linus Torvalds's avatar
      Merge branch 'for-v3.8' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping · c45564e9
      Linus Torvalds authored
      Pull CMA and DMA-mapping update from Marek Szyprowski:
       "Another set of Contiguous Memory Allocator and DMA-mapping framework
        updates for v3.8.
      
        This pull request consists only of two patches.  The first fixes a
        long standing issue with dmapools (the code predates current GIT
        history), which forced all allocations to use GFP_ATOMIC flag,
        ignoring the flags passed by the caller.  The second patch changes CMA
        code to correctly use the phys_addr_t type, which enables support for LPAE
        systems."
      
      * 'for-v3.8' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping:
        drivers: cma: represent physical addresses as phys_addr_t
        mm: dmapool: use provided gfp flags for all dma_alloc_coherent() calls
      c45564e9
    • Linus Torvalds's avatar
      Merge tag 'clk-for-linus' of git://git.linaro.org/people/mturquette/linux · 93874681
      Linus Torvalds authored
      Pull clock framework changes from Mike Turquette:
       "The common clock framework changes for 3.8 are comprised of lots of
        fixes for existing platforms as well as new ports for some ARM
        platforms.  In addition there are new clk drivers for audio devices
        and MFDs."
      
      Fix up trivial conflict in <linux/clk-provider.h> (removal of 'inline'
      clashing with return type fixes)
      
      * tag 'clk-for-linus' of git://git.linaro.org/people/mturquette/linux: (51 commits)
        MAINTAINERS: bad email address for Mike Turquette
        clk: introduce optional disable_unused callback
        clk: ux500: fix bit error
        clk: clock multiplexers may register out of order
        clk: ux500: Initial support for abx500 clock driver
        CLK: SPEAr: Remove unused dummy apb_pclk
        CLK: SPEAr: Correct index scanning done for clock synths
        CLK: SPEAr: Update clock rate table
        CLK: SPEAr: Add missing clocks
        CLK: SPEAr: Set CLK_SET_RATE_PARENT for few clocks
        CLK: SPEAr13xx: fix parent names of multiple clocks
        CLK: SPEAr13xx: Fix mux clock names
        CLK: SPEAr: Fix dev_id & con_id for multiple clocks
        clk: move IM-PD1 clocks to drivers/clk
        clk: make ICST driver handle the VCO registers
        clk: add GPLv2 headers to the Versatile clock files
        clk: mxs: Use a better name for the USB PHY clock
        clk: spear: Add stub functions for spear3[0|1|2]0_clk_init()
        CLK: clk-twl6040: fix return value check in twl6040_clk_probe()
        clk: ux500: Register nomadik keypad clock lookups for u8500
        ...
      93874681
    • Linus Torvalds's avatar
      Merge tag 'pinctrl-for-v3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl · 505cbeda
      Linus Torvalds authored
      Pull pinctrl changes from Linus Walleij:
       "These are the first and major pinctrl changes for the v3.8 merge
        cycle.  Some of this is used as merge base for other trees so I better
        be early on the trigger.
      
        As can be seen from the diffstat the major changes are:
      
        - A big conversion of the AT91 pinctrl driver and the associated ACKed
          platform changes under arch/arm/mach-at91 and its device trees.  This
          has been coordinated with the AT91 maintainers to go in through the
          pinctrl tree.
      
        - A larger chunk of changes to the SPEAr drivers and the addition of
          the "plgpio" driver for the SPEAr as well.
      
        - The removal of the remnants of the Nomadik driver from the arch/arm
          tree and fusion of that into the Nomadik driver and platform data
          header files.
      
        - Some local movement in the Marvell MVEBU drivers, these now have
          their own subdirectory.
      
        - The addition of a chunk of code to gpiolib under drivers/gpio to
          register gpio-to-pin range mappings from the GPIO side of things.
          This has been requested by Grant Likely and is now implemented, it
          is particularly useful for device tree work.
      
        Then we have incremental updates all over the place, many of these are
        cleanups and fixes from Axel Lin who has done a great job of removing
        minor mistakes and compilation annoyances."
      
      * tag 'pinctrl-for-v3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (114 commits)
        ARM: mmp: select PINCTRL for ARCH_MMP
        pinctrl: Drop selecting PINCONF for MMP2, PXA168 and PXA910
        pinctrl: pinctrl-single: Fix error check condition
        pinctrl: SPEAr: Update error check for unsigned variables
        gpiolib: Fix use after free in gpiochip_add_pin_range
        gpiolib: rename pin range arguments
        pinctrl: single: support gpio request and free
        pinctrl: generic: add input schmitt disable parameter
        pinctrl/u300/coh901: stop spawning pinctrl from GPIO
        pinctrl/u300/coh901: let the gpio_chip register the range
        pinctrl: add function to retrieve range from pin
        gpiolib: return any error code from range creation
        pinctrl: make range registration defer properly
        gpiolib: rename find_pinctrl_*
        gpiolib: let gpiochip_add_pin_range() specify offset
        ARM: at91: pm9g45: add mmc support
        ARM: at91: Animeo IP: add mmc support
        ARM: at91: dt: add mmc pinctrl for Atmel reference boards
        ARM: at91: dt: at91sam9: add mmc pinctrl support
        ARM: at91/dts: add nodes for atmel hsmci controllers for atmel boards
        ...
      505cbeda
    • Linus Torvalds's avatar
      Merge tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging · a8936db7
      Linus Torvalds authored
      Pull hwmon updates from Guenter Roeck:
       "New driver: DA9055
      
        Added/improved support for new chips in existing drivers: Z650/670,
        N550/570, ADS7830, AMD 16h family"
      
      * tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
        hwmon: (da9055) Fix chan_mux[DA9055_ADC_ADCIN3] setting
        hwmon: DA9055 HWMON driver
        hwmon: (coretemp) List TjMax for Z650/670 and N550/570
        hwmon: (coretemp) Drop N4xx, N5xx, D4xx, D5xx CPUs from tjmax table
        hwmon: (coretemp) Use model table instead of if/else to identify CPU models
        hwmon: da9052: Use da9052_reg_update for rmw operations
        hwmon: (coretemp) Drop dependency on PCI for TjMax detection on Atom CPUs
        hwmon: (ina2xx) use module_i2c_driver to simplify the code
        hwmon: (ads7828) add support for ADS7830
        hwmon: (ads7828) driver cleanup
        x86,AMD: Power driver support for AMD's family 16h processors
      a8936db7
    • Linus Torvalds's avatar
      Merge tag 'mmc-updates-for-3.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc · 11b84c58
      Linus Torvalds authored
      Pull MMC updates from Chris Ball:
       "MMC highlights for 3.8:
      
        Core:
         - Expose access to the eMMC RPMB ("Replay Protected Memory Block")
           area by extending the existing mmc_block ioctl.
         - Add SDIO powered-suspend DT properties to the core MMC DT binding.
         - Add no-1-8-v DT flag for boards where the SD controller reports
           that it supports 1.8V but the board itself has no way to switch to
           1.8V.
         - More work on switching to 1.8V UHS support using a vqmmc regulator.
         - Fix up a case where the slot-gpio helper may fail to reset the host
           controller properly if a card was removed during a transfer.
         - Fix several cases where a broken device could cause an infinite
           loop while we wait for a register to update.
      
        Drivers:
         - at91-mci: Remove obsolete driver, atmel-mci handles these devices
           now.
         - sdhci-dove: Allow using GPIOs for card-detect notifications.
         - sdhci-esdhc: Fix for recovering from ADMA errors on broken silicon.
         - sdhci-s3c: Add pinctrl support.
         - wmt-sdmmc: New driver for WonderMedia SD/MMC controllers."
      
      * tag 'mmc-updates-for-3.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc: (65 commits)
        mmc: sdhci: implement the .card_event() method
        mmc: extend the slot-gpio card-detection to use host's .card_event() method
        mmc: add a card-event host operation
        mmc: sdhci-s3c: Fix compilation warning
        mmc: sdhci-pci: Enable SDHCI_CAN_DO_HISPD for Ricoh SDHCI controller
        mmc: sdhci-dove: allow GPIOs to be used for card detection on Dove
        mmc: sdhci-dove: use two-stage initialization for sdhci-pltfm
        mmc: sdhci-dove: use devm_clk_get()
        mmc: eSDHC: Recover from ADMA errors
        mmc: dw_mmc: remove duplicated buswidth code
        mmc: dw_mmc: relocate where dw_mci_setup_bus() is called from
        mmc: Limit MMC speed to 52MHz if not HS200
        mmc: dw_mmc: use devres functions in dw_mmc
        mmc: sh_mmcif: remove unneeded clock connection ID
        mmc: sh_mobile_sdhi: remove unneeded clock connection ID
        mmc: sh_mobile_sdhi: fix clock frequency printing
        mmc: Remove redundant null check before kfree in bus.c
        mmc: Remove redundant null check before kfree in sdio_bus.c
        mmc: sdhci-imx-esdhc: use more devm_* functions
        mmc: dt: add no-1-8-v device tree flag
        ...
      11b84c58
    • Vitaly Andrianov's avatar
      drivers: cma: represent physical addresses as phys_addr_t · 4009793e
      Vitaly Andrianov authored
      This commit changes the CMA early initialization code to use phys_addr_t
      for representing physical addresses instead of unsigned long.
      
      Without this change, among other things, dma_declare_contiguous() simply
      discards any memory regions whose address is not representable as unsigned
      long.
      
      This is a problem on 32-bit PAE machines where unsigned long is 32-bit
      but the physical address space is larger.
      Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
      Signed-off-by: Cyril Chemparathy <cyril@ti.com>
      Acked-by: Michal Nazarewicz <mina86@mina86.com>
      Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
      4009793e
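      The truncation this commit describes can be sketched in user-space C.
      This is only an illustration, not the kernel's CMA code: the typedefs
      narrow_ulong_t and wide_phys_addr_t and all function names below are
      hypothetical stand-ins for a 32-bit unsigned long and a 64-bit
      phys_addr_t.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins: on a 32-bit PAE/LPAE kernel, unsigned long is
 * 32 bits wide while phys_addr_t can be 64 bits wide. */
typedef uint32_t narrow_ulong_t;   /* models a 32-bit unsigned long */
typedef uint64_t wide_phys_addr_t; /* models a 64-bit phys_addr_t */

/* Models the value a narrow variable would hold after assignment. */
static narrow_ulong_t narrow(wide_phys_addr_t addr)
{
    return (narrow_ulong_t)addr;
}

/* Returns true when storing addr in the narrow type would lose bits. */
static bool loses_bits_when_narrowed(wide_phys_addr_t addr)
{
    return (wide_phys_addr_t)narrow(addr) != addr;
}

/* Small self-check that can be called from main(). */
static void check_narrowing_examples(void)
{
    /* An address below 4 GiB survives the round trip. */
    assert(!loses_bits_when_narrowed(0x80000000ULL));
    /* An address above 4 GiB does not: the upper bits are dropped. */
    assert(loses_bits_when_narrowed(0x880000000ULL));
    assert(narrow(0x880000000ULL) == 0x80000000u);
}
```

      Calling check_narrowing_examples() from a main() function exercises
      both cases. A region based above 4 GiB cannot be described by the
      narrow type at all, which is why the commit moves to phys_addr_t.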
    • Marek Szyprowski's avatar
      mm: dmapool: use provided gfp flags for all dma_alloc_coherent() calls · 387870f2
      Marek Szyprowski authored
      dmapool always calls dma_alloc_coherent() with the GFP_ATOMIC flag,
      regardless of the flags provided by the caller. This causes excessive
      pruning of emergency memory pools without any good reason. Additionally,
      on the ARM architecture any driver which uses dmapools will sooner or
      later trigger the following error:
      "ERROR: 256 KiB atomic DMA coherent pool is too small!
      Please increase it with coherent_pool= kernel parameter!".
      Increasing the coherent pool size usually doesn't help much and only
      delays such error, because all GFP_ATOMIC DMA allocations are always
      served from the special, very limited memory pool.
      
      This patch changes the dmapool code to correctly use gfp flags provided
      by the dmapool caller.
      Reported-by: Soeren Moch <smoch@web.de>
      Reported-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
      Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
      Tested-by: Andrew Lunn <andrew@lunn.ch>
      Tested-by: Soeren Moch <smoch@web.de>
      Cc: stable@vger.kernel.org
      387870f2
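      The difference between the old and new behaviour can be modelled in
      a few lines of user-space C. Everything here is an assumption made
      for illustration: the enum values are not the kernel's gfp_t
      constants, backend_alloc() merely stands in for
      dma_alloc_coherent(), and pool_alloc_old()/pool_alloc_new() are
      hypothetical names rather than the dmapool API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical flag values; the kernel's gfp_t constants differ. */
enum example_gfp { EXAMPLE_GFP_ATOMIC = 1, EXAMPLE_GFP_KERNEL = 2 };

/* Records the flags seen by the backing allocator, for inspection. */
static enum example_gfp last_backend_flags;

/* Stand-in for dma_alloc_coherent(): only notes which flags it got. */
static void *backend_alloc(size_t size, enum example_gfp flags)
{
    last_backend_flags = flags;
    return malloc(size);
}

/* Behaviour described as the bug: the caller's flags are ignored. */
static void *pool_alloc_old(size_t size, enum example_gfp caller_flags)
{
    (void)caller_flags;
    return backend_alloc(size, EXAMPLE_GFP_ATOMIC);
}

/* Behaviour described as the fix: the caller's flags are passed on. */
static void *pool_alloc_new(size_t size, enum example_gfp caller_flags)
{
    return backend_alloc(size, caller_flags);
}

/* Small self-check that can be called from main(). */
static void check_flag_passthrough(void)
{
    void *p = pool_alloc_old(64, EXAMPLE_GFP_KERNEL);
    assert(last_backend_flags == EXAMPLE_GFP_ATOMIC);
    free(p);

    p = pool_alloc_new(64, EXAMPLE_GFP_KERNEL);
    assert(last_backend_flags == EXAMPLE_GFP_KERNEL);
    free(p);
}
```

      In the old variant every request reaches the backing allocator as
      an atomic one, so even callers that are allowed to sleep draw on
      the small atomic pool mentioned in the error message.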
    • Mike Turquette's avatar
      clk: introduce optional disable_unused callback · 7c045a55
      Mike Turquette authored
      Some gate clocks have special needs which must be handled during the
      disable-unused clocks sequence.  These needs might be driven by software
      due to the fact that we're disabling a clock outside of the normal
      clk_disable path and a clk's enable_count will not be accurate.  On the
      other hand a specific hardware programming sequence might need to be
      followed for this corner case.
      
      This change is needed for the upcoming OMAP port to the common clock
      framework.  Specifically, it is undesirable to treat the disable-unused
      path identically to the normal clk_disable path since other software
      layers are involved.  In this case OMAP's clockdomain code throws WARNs
      and bails early due to the clock's enable_count being set to zero.  A
      custom callback mitigates this problem nicely.
      
      Cc: Paul Walmsley <paul@pwsan.com>
      Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
      Acked-by: Linus Walleij <linus.walleij@linaro.org>
      Signed-off-by: Mike Turquette <mturquette@linaro.org>
      7c045a55
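      An optional callback of this kind is typically dispatched with a
      fallback to the regular operation. The sketch below is a simplified
      user-space model under that assumption, not the framework's code:
      struct example_clk, struct example_clk_ops and every function name
      are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct example_clk;

/* Hypothetical ops table modelled on the description above. */
struct example_clk_ops {
    void (*disable)(struct example_clk *clk);
    void (*disable_unused)(struct example_clk *clk); /* optional */
};

struct example_clk {
    const struct example_clk_ops *ops;
    int enable_count;     /* software reference count */
    int hw_enabled;       /* models the hardware gate state */
    int used_custom_path; /* set only by the custom callback */
};

static void plain_disable(struct example_clk *clk)
{
    clk->hw_enabled = 0;
}

static void custom_disable_unused(struct example_clk *clk)
{
    clk->hw_enabled = 0;
    clk->used_custom_path = 1;
}

/* Gate a clock that is on in hardware but has no software users.
 * Prefer the optional callback; fall back to the normal disable. */
static void disable_if_unused(struct example_clk *clk)
{
    assert(clk != NULL && clk->ops != NULL);
    if (clk->enable_count > 0 || !clk->hw_enabled)
        return;
    if (clk->ops->disable_unused)
        clk->ops->disable_unused(clk);
    else if (clk->ops->disable)
        clk->ops->disable(clk);
}
```

      A clock whose ops table leaves disable_unused as NULL is gated
      through plain_disable(), while one that sets it takes the custom
      path. A clock with a non-zero enable_count is left alone.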
    • Linus Torvalds's avatar
      Linux 3.7 · 29594404
      Linus Torvalds authored
      29594404
    • Catalin Marinas's avatar
      arm64: Fix the dtbs target building · 58fea354
      Catalin Marinas authored
      The arch/arm64/Makefile was not passing the right target to the
      boot/dts/Makefile.
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Rob Herring <rob.herring@calxeda.com>
      58fea354
    • Florian Fainelli's avatar
      Input: matrix-keymap - provide proper module license · 55220bb3
      Florian Fainelli authored
      The matrix-keymap module currently lacks a proper module license.
      Add one so that this module does not taint the entire kernel.  This
      issue has been present since commit 1932811f ("Input: matrix-keymap
      - uninline and prepare for device tree support").
      Signed-off-by: Florian Fainelli <florian@openwrt.org>
      CC: stable@vger.kernel.org # v3.5+
      Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      55220bb3
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 2c68bc72
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Netlink socket dumping had several missing verifications and checks.
      
          In particular, address comparisons in the request byte code
          interpreter could access past the end of the address in the
          inet_request_sock.
      
          Also, address family and address prefix lengths were not validated
          properly at all.
      
          This means arbitrary applications can read past the end of certain
          kernel data structures.
      
          Fixes from Neal Cardwell.
      
       2) ip_check_defrag() operates in contexts where we're in the process
          of, or about to, input the packet into the real protocols
          (specifically macvlan and AF_PACKET snooping).
      
          Unfortunately, it does a pskb_may_pull() which can modify the
          backing packet data which is not legal if the SKB is shared.  It
          very much can be shared in this context.
      
          Deal with the possibility that the SKB is segmented by using
          skb_copy_bits().
      
          Fix from Johannes Berg based upon a report by Eric Leblond.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
        ipv4: ip_check_defrag must not modify skb before unsharing
        inet_diag: validate port comparison byte code to prevent unsafe reads
        inet_diag: avoid unsafe and nonsensical prefix matches in inet_diag_bc_run()
        inet_diag: validate byte code to prevent oops in inet_diag_bc_run()
        inet_diag: fix oops for IPv4 AF_INET6 TCP SYN-RECV state
      2c68bc72
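      The first item of the networking pull above concerns address and
      prefix-length comparisons that could read past the end of an
      address. A minimal user-space sketch of a bounds-checked prefix
      comparison follows. It is not the inet_diag code: prefix_matches()
      and its parameters are hypothetical and only illustrate the kind of
      validation the summary refers to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true when the first prefix_bits bits of a and b are equal.
 * A prefix longer than the address is rejected up front, so the loop
 * and the partial-byte check never index past addr_len_bytes. */
static bool prefix_matches(const uint8_t *a, const uint8_t *b,
                           size_t addr_len_bytes, unsigned prefix_bits)
{
    assert(a != NULL && b != NULL);

    if (prefix_bits > addr_len_bytes * 8)
        return false;

    size_t full_bytes = prefix_bits / 8;
    unsigned rem_bits = prefix_bits % 8;

    /* Compare the whole bytes covered by the prefix. */
    for (size_t i = 0; i < full_bytes; i++)
        if (a[i] != b[i])
            return false;

    /* Compare the leading rem_bits bits of the next byte, if any. */
    if (rem_bits) {
        uint8_t mask = (uint8_t)(0xFFu << (8 - rem_bits));
        if ((a[full_bytes] ^ b[full_bytes]) & mask)
            return false;
    }
    return true;
}
```

      With 4-byte addresses, a prefix length of 33 is refused instead of
      touching a fifth byte, and a length of 32 compares exactly four
      bytes with no partial-byte access.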
  3. 10 Dec, 2012 3 commits