1. 16 Jul, 2007 20 commits
    • slob: rework freelist handling · 95b35127
      Nick Piggin authored
      Improve slob by turning the freelist into a list of pages using struct page
      fields, then each page has a singly linked freelist of slob blocks via a
      pointer in the struct page.
      
      - The first benefit is that the slob freelists can be indexed by a smaller
        type (2 bytes, if the PAGE_SIZE is reasonable).
      
      - Next is that freeing is much quicker because it does not have to traverse
        the entire freelist. Allocation can be slightly faster too, because we can
        skip almost-full freelist pages completely.
      
      - Slob pages are then freed immediately when they become empty, rather than
        having a periodic timer try to free them. This gives efficiency and memory
        consumption improvement.
      
      Then, we don't encode separate size and next fields into each slob block;
      rather, we use the sign bit to distinguish between "size" and "next". Then
      size 1 blocks contain a "next" offset, and others contain the "size" in
      the first unit and "next" in the second unit.
      
      - This allows minimum slob allocation alignment to go from 8 bytes to 2
        bytes on 32-bit and 12 bytes to 2 bytes on 64-bit. In practice, it is
        best to align them to word size; however, some architectures (e.g. cris)
        could gain space savings from turning off this extra alignment.
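
The sign-bit encoding described above can be sketched in userspace C roughly as follows. This is only an illustration under stated assumptions, not the kernel code: the `unit_t` type, the plain array standing in for a page, and the helper names `set_block`, `block_units` and `block_next` are all made up for the example. Next offsets are assumed to be nonzero, since a following free block always lies after the current one.

```c
#include <stdint.h>

/* One allocation unit; 2 bytes, as the message suggests for a reasonable PAGE_SIZE. */
typedef int16_t unit_t;

/*
 * Write a free-block header at index idx of page.
 * A block larger than one unit stores its (positive) size in the first unit
 * and the offset of the next free block in the second unit. A one-unit block
 * has no room for both, so it stores only the negated next offset.
 */
static void set_block(unit_t *page, int idx, unit_t size, unit_t next)
{
    if (size > 1) {
        page[idx] = size;
        page[idx + 1] = next;
    } else {
        page[idx] = (unit_t)-next;
    }
}

/* Size of the free block at idx, in units: positive header means "size". */
static unit_t block_units(const unit_t *page, int idx)
{
    return page[idx] > 0 ? page[idx] : 1;
}

/* Offset of the next free block: negative header means "next". */
static unit_t block_next(const unit_t *page, int idx)
{
    return page[idx] < 0 ? (unit_t)-page[idx] : page[idx + 1];
}
```

With this layout a one-unit free block costs a single 2-byte header, which is what makes the smaller minimum alignment possible.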
      
      Then, make kmalloc use its own slob_block at the front of the allocation
      in order to encode allocation size, rather than rely on not overwriting
      slob's existing header block.
      
      - This reduces kmalloc allocation overhead similarly to alignment reductions.
      
      - Decouples kmalloc layer from the slob allocator.
      
      Then, add a page flag specific to slob pages.
      
      - This means kfree of a page aligned slob block doesn't have to traverse
        the bigblock list.
      
      I would get benchmarks, but my test box's network doesn't come up with
      slob before this patch. I think something is timing out. Anyway, things
      are faster after the patch.
      
      Code size goes up about 1K, however dynamic memory usage _should_ be
      lower even on relatively small memory systems.
      
      Future todo item is to restore the cyclic free list search, rather than
      to always begin at the start.
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Acked-by: Matt Mackall <mpm@selenic.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Remove the deprecated "kmem_cache_t" typedef from slab.h. · 698827fa
      Robert P. J. Day authored
      Given that there is no remaining usage of the deprecated kmem_cache_t
      typedef anywhere in the tree, remove that typedef.
      Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
      Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
      Acked-by: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • MM: alloc_large_system_hash() can free some memory for non power-of-two bucketsize · 1037b83b
      Eric Dumazet authored
      alloc_large_system_hash() is called at boot time to allocate space for
      several large hash tables.
      
      Lately, the TCP hash table was changed and its bucket size is no longer
      a power of two.
      
      On most setups, alloc_large_system_hash() allocates one big page (order >
      0) with __get_free_pages(GFP_ATOMIC, order).  This single high-order page
      has a power-of-two size, bigger than the needed size.
      
      We can free all pages that won't be used by the hash table.
      
      On a 1GB i386 machine, this patch saves 128 KB of LOWMEM memory.
      
      TCP established hash table entries: 32768 (order: 6, 393216 bytes)
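
The 128 KB figure can be checked with a little arithmetic. The sketch below assumes 4 KB pages and uses hypothetical helper names (`pages_needed`, `order_for`, `unused_tail_bytes`); it only models the rounding, not the actual allocation or freeing done by the patch, and `order_for` here is simply the smallest order that covers the table.

```c
#define SKETCH_PAGE_SIZE 4096UL  /* assumption: 4 KB pages, as on i386 */

/* Number of pages needed to hold bytes, rounded up. */
static unsigned long pages_needed(unsigned long bytes)
{
    return (bytes + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}

/* Smallest order such that (1 << order) pages cover bytes. */
static unsigned int order_for(unsigned long bytes)
{
    unsigned int order = 0;

    while ((1UL << order) < pages_needed(bytes))
        order++;
    return order;
}

/* Bytes at the tail of the power-of-two block that the table never uses. */
static unsigned long unused_tail_bytes(unsigned long bytes)
{
    unsigned long allocated_pages = 1UL << order_for(bytes);

    return (allocated_pages - pages_needed(bytes)) * SKETCH_PAGE_SIZE;
}
```

For the 393216-byte table above, `pages_needed()` gives 96 pages, the covering block has 128 pages, and the 32 unused pages amount to 131072 bytes, i.e. the 128 KB mentioned in the message.
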
      Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Make /proc/slabinfo use seq_list_xxx helpers · b92151ba
      Pavel Emelianov authored
      This entry prints a header in the .start callback.  This is OK, but the more
      elegant solution would be to move this into the .show callback and use
      seq_list_start_head() in the .start one.
      
      I have left it as is so that the patch just switches to the new API and
      nothing more.
      
      [adobriyan@sw.ru: Wrong pointer was used as kmem_cache pointer]
      Signed-off-by: Pavel Emelianov <xemul@openvz.org>
      Cc: Christoph Lameter <clameter@sgi.com>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • MM: use DIV_ROUND_UP() in mm/memory.c · 68e116a3
      Rolf Eike Beer authored
      Replace a hand coded version of DIV_ROUND_UP().
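
For readers unfamiliar with the macro, here is a small userspace sketch of the idiom. The macro body mirrors the usual kernel definition; the `pages_for` helper and its parameters are hypothetical and are not taken from mm/memory.c.

```c
/* Integer division that rounds up instead of truncating. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical helper: number of pages needed to cover len bytes. */
static unsigned long pages_for(unsigned long len, unsigned long page_size)
{
    /* The hand-coded equivalent would be (len + page_size - 1) / page_size. */
    return DIV_ROUND_UP(len, page_size);
}
```

Using the named macro states the intent directly and avoids off-by-one slips in the open-coded form.
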
      Signed-off-by: Rolf Eike Beer <eike-kernel@sf-tec.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • hugetlb: remove unnecessary nid initialization · 31a5c6e4
      Nishanth Aravamudan authored
      nid is initialized to numa_node_id() but will either be overwritten in
      the loop or not used in the conditional. So remove the initialization.
      Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • change zonelist order: zonelist order selection logic · f0c0b2b8
      KAMEZAWA Hiroyuki authored
      Make zonelist creation policy selectable from sysctl/boot option v6.
      
      This patch makes NUMA's zonelist (of pgdat) order selectable.
      Available orders are Default (automatic) / Node-based / Zone-based.
      
      [Default Order]
      The kernel selects Node-based or Zone-based order automatically.
      
      [Node-based Order]
      This policy treats the locality of memory as the most important parameter.
      Zonelist order is created by each zone's locality. This means lower zones
      (e.g. ZONE_DMA) can be used before higher zones (e.g. ZONE_NORMAL) are
      exhausted. IOW, ZONE_DMA will be in the middle of the zonelist.
      The current 2.6.21 kernel uses this.
      
      Pros.
       * A user can expect local memory as much as possible.
      Cons.
       * Lower zones will be exhausted before higher zones. This may cause OOM_KILL.
      
      Maybe suitable if ZONE_DMA is relatively big, you never see OOM_KILL
      because of ZONE_DMA exhaustion, and you need the best locality.
      
      (example)
      assume 2 node NUMA. node(0) has ZONE_DMA/ZONE_NORMAL, node(1) has ZONE_NORMAL.
      
      *node(0)'s memory allocation order:
      
       node(0)'s NORMAL -> node(0)'s DMA -> node(1)'s NORMAL.
      
      *node(1)'s memory allocation order:
      
       node(1)'s NORMAL -> node(0)'s NORMAL -> node(0)'s DMA.
      
      [Zone-based order]
      This policy treats the zone type as the most important parameter.
      Zonelist order is created by zone-type order. This means a lower zone
      is never used before the higher zones are exhausted.
      IOW, ZONE_DMA will always be at the tail of the zonelist.
      
      Pros.
       * OOM_KILL (because of a lower zone) occurs only if all zones are exhausted.
      Cons.
       * Memory locality may not be best.
      
      (example)
      assume 2 node NUMA. node(0) has ZONE_DMA/ZONE_NORMAL, node(1) has ZONE_NORMAL.
      
      *node(0)'s memory allocation order:
      
       node(0)'s NORMAL -> node(1)'s NORMAL -> node(0)'s DMA.
      
      *node(1)'s memory allocation order:
      
       node(1)'s NORMAL -> node(0)'s NORMAL -> node(0)'s DMA.
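
The two orderings in the examples above can be reproduced with a small userspace sketch. Everything here is an assumption made for illustration: the `present` table encodes the 2-node example, `locality_order` simply puts the local node first, and the function names do not correspond to the kernel's zonelist builders.

```c
#define NR_NODES 2

enum zone_type { ZONE_DMA, ZONE_NORMAL, NR_ZONES };

struct zone_ref {
    int node;
    enum zone_type zone;
};

/* Example topology: node 0 has DMA and NORMAL, node 1 has only NORMAL. */
static const int present[NR_NODES][NR_ZONES] = {
    { 1, 1 },
    { 0, 1 },
};

/* Nodes sorted by locality: the local node first, then the others. */
static void locality_order(int local, int nodes[NR_NODES])
{
    int n, i = 0;

    nodes[i++] = local;
    for (n = 0; n < NR_NODES; n++)
        if (n != local)
            nodes[i++] = n;
}

/* Node-based: walk nodes by locality, and within a node go high zone to low. */
static int build_node_order(int local, struct zone_ref out[])
{
    int nodes[NR_NODES], i, z, count = 0;

    locality_order(local, nodes);
    for (i = 0; i < NR_NODES; i++)
        for (z = NR_ZONES - 1; z >= 0; z--)
            if (present[nodes[i]][z]) {
                out[count].node = nodes[i];
                out[count].zone = (enum zone_type)z;
                count++;
            }
    return count;
}

/* Zone-based: walk zone types high to low, and within a type go by locality. */
static int build_zone_order(int local, struct zone_ref out[])
{
    int nodes[NR_NODES], i, z, count = 0;

    locality_order(local, nodes);
    for (z = NR_ZONES - 1; z >= 0; z--)
        for (i = 0; i < NR_NODES; i++)
            if (present[nodes[i]][z]) {
                out[count].node = nodes[i];
                out[count].zone = (enum zone_type)z;
                count++;
            }
    return count;
}
```

The only difference between the two builders is which loop is outermost, which is why node(1) ends up with the same list under both policies while node(0) does not.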
      
      The boot option "numa_zonelist_order=" and a proc/sysctl interface are supported.
      
      command:
      %echo N > /proc/sys/vm/numa_zonelist_order
      
      Will rebuild zonelist in Node-based order.
      
      command:
      %echo Z > /proc/sys/vm/numa_zonelist_order
      
      Will rebuild zonelist in Zone-based order.
      
      Thanks to Lee Schermerhorn, who gave me much help and code.
      
      [Lee.Schermerhorn@hp.com: add check_highest_zone to build_zonelists_in_zone_order]
      [akpm@linux-foundation.org: build fix]
      Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Cc: Christoph Lameter <clameter@sgi.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: "jesse.barnes@intel.com" <jesse.barnes@intel.com>
      Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • serial: convert early_uart to earlycon for 8250 · 18a8bd94
      Yinghai Lu authored
      Because SERIAL_PORT_DFNS is removed from include/asm-i386/serial.h and
      include/asm-x86_64/serial.h, the serial8250_ports need to be probed late in
      the serial initialization stage.  The call chain console_init =>
      serial8250_console_init => register_console => serial8250_console_setup
      will return -ENODEV, and console ttyS0 cannot be enabled at that time.  We
      need to wait until uart_add_one_port in drivers/serial/serial_core.c calls
      register_console to get console ttyS0.  That is too late.
      
      Make early_uart use early_param, so the uart console can be used earlier.
      Make it a boot console with the CON_BOOT flag, so it can use the console
      handover feature and switch to the corresponding normal serial console
      automatically.
      
      The new command line will be:
      	console=uart8250,io,0x3f8,9600n8
      	console=uart8250,mmio,0xff5e0000,115200n8
      or
      	earlycon=uart8250,io,0x3f8,9600n8
      	earlycon=uart8250,mmio,0xff5e0000,115200n8
      
      It will print at a very early stage:
      	Early serial console at I/O port 0x3f8 (options '9600n8')
      	console [uart0] enabled
      Later, for the console, it will print:
      	console handover: boot [uart0] -> real [ttyS0]
      
      Signed-off-by: <yinghai.lu@sun.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Gerd Hoffmann <kraxel@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • x86: initial fixmap support · b1c931e3
      Eric W. Biderman authored
      Needed to get fixed virtual address for USB debug and earlycon with mmio.
      Signed-off-by: Eric W. Biderman <ebiderman@xmisson.com>
      Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Gerd Hoffmann <kraxel@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • console: console handover to preferred console · d37bf60d
      Yinghai Lu authored
      For earlyprintk=ttyS0,9600 console=tty0 console=ttyS0,9600n8
      
      the handover will happen from earlyser0 to tty0, but what we want is to
      hand over to ttyS0.
      
      Later, with serial-convert-early_uart-to-earlycon-for-8250.patch,
      
      	console=tty0 console=uart8250,io,0x3f8,9600n8
      
      will hand over to ttyS0 instead of tty0.
      Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Gerd Hoffmann <kraxel@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • console: more buf for index parsing · eaa944af
      Yinghai Lu authored
      Rename name to buf, since it is used to hold name + index.
      Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Gerd Hoffmann <kraxel@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • serial: assert DTR for serial console devices · 79492689
      Yinghai Lu authored
      Some RS-232 devices require DTR to be asserted before they can be used.  DTR
      is normally asserted in uart_startup() when the port is opened.  But we don't
      actually open serial console ports, so assert DTR when the port is added.
      
      BTW:
      earlyprintk and early_uart are hard coded to set DTR/RTS.
      
      rmk says
      
        The only issue I can think of is the possibility for an attached modem to
        auto-answer or maybe even auto-dial before the system is ready for it to do
        so.  Might have an undesirable cost implication for some running with such a
        setup.
      
        Apart from that, I can't think of any other side effect of this specific
        patch.
      Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
      Acked-by: Russell King <rmk@arm.linux.org.uk>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
      Cc: Gerd Hoffmann <kraxel@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • lib: add idr_remove_all · 23936cc0
      Kristian Hoegsberg authored
      Remove all ids from the given idr tree.  idr_destroy() only frees up
      unused, cached idp_layers, but this function will remove all id mappings
      and leave all idp_layers unused.
      
      A typical clean-up sequence for objects stored in an idr tree will use
      idr_for_each() to free all objects, if necessary, then idr_remove_all() to
      remove all ids, and idr_destroy() to free up the cached idr_layers.
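
The three-step clean-up order can be illustrated with a toy userspace model. This is not the kernel idr: `struct toy_idr`, `toy_for_each`, `toy_remove_all` and `toy_cleanup` are hypothetical stand-ins backed by a flat array, chosen only to show the sequence of calls and the callback shape.

```c
#include <stdlib.h>

#define TOY_MAX_IDS 64

/* Toy stand-in for an idr: a flat id -> pointer table (assumption). */
struct toy_idr {
    void *slot[TOY_MAX_IDS];
};

/* Call fn for every stored pointer; stop early if fn returns non-zero. */
static int toy_for_each(struct toy_idr *idp,
                        int (*fn)(int id, void *p, void *data), void *data)
{
    int id, ret;

    for (id = 0; id < TOY_MAX_IDS; id++) {
        if (!idp->slot[id])
            continue;
        ret = fn(id, idp->slot[id], data);
        if (ret)
            return ret;
    }
    return 0;
}

/* Drop every id mapping without touching the objects themselves. */
static void toy_remove_all(struct toy_idr *idp)
{
    int id;

    for (id = 0; id < TOY_MAX_IDS; id++)
        idp->slot[id] = NULL;
}

/* Callback: free one object and count it. */
static int free_object(int id, void *p, void *data)
{
    int *freed = data;

    (void)id;
    free(p);
    (*freed)++;
    return 0;
}

/* Clean-up in the order the message describes; returns objects freed. */
static int toy_cleanup(struct toy_idr *idp)
{
    int freed = 0;

    toy_for_each(idp, free_object, &freed); /* 1. free the objects */
    toy_remove_all(idp);                    /* 2. remove all ids */
    /* 3. a real idr would now call idr_destroy(); the toy caches nothing */
    return freed;
}
```

The ordering matters: the objects must be freed while their ids still resolve, and the mappings must be gone before the container's own cached memory is released.
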
      Signed-off-by: Kristian Hoegsberg <krh@redhat.com>
      Cc: Tejun Heo <htejun@gmail.com>
      Cc: Dave Airlie <airlied@linux.ie>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • lib: add idr_for_each() · 96d7fa42
      Kristian Hoegsberg authored
      This patch adds an iterator function for the idr data structure.  Compared
      to just iterating through the idr with an integer and idr_find, this
      iterator is (almost, but not quite) linear in the number of elements, as
      opposed to the number of integers in the range covered by the idr.  This
      makes a difference for sparse idrs, but more importantly, it's a nicer way
      to iterate through the elements.
      
      The drm subsystem is moving to idr for tracking contexts and drawables, and
      with this change, we can use the idr exclusively for tracking these
      resources.
      
      [akpm@linux-foundation.org: fix comment]
      Signed-off-by: Kristian Hoegsberg <krh@redhat.com>
      Cc: Tejun Heo <htejun@gmail.com>
      Cc: Dave Airlie <airlied@linux.ie>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • update checkpatch.pl to version 0.07 · de7d4f0e
      Andy Whitcroft authored
      This version brings a number of new checks, fixes for false
      positives, plus a clarification of the output to better guide use.  Of
      note:
      
        - checks for documentation for new __setup calls
        - clearer reporting where braces and parentheses are involved
        - reports for closing brace and semi-colon spacing
        - reports on unwanted externs
      
      This patch includes an update to the documentation on checkpatch.pl
      itself to clarify when it should be used and to indicate that it
      is not intended as the final arbiter of style.
      
      Full changelog:
      
      Andy Whitcroft (19):
            Version: 0.07
            ensure we do not apply control brace checks to preprocessor directives
            add {u,s}{8,16,32,64} to the type matcher
            accept lack of spacing after the semicolons in for (;;)
            report new externs in .c files
            fix up typedef exclusion for function prototypes
            else trailing statements check need to account for \ at end of line
            add enums to the type matcher
            add missing check descriptions
            suppress double reporting of ** spacing
            report on do{ spacing issues
            include an example of the brace/parenthesis in output
            check for spacing after closing braces
            prevent double reports on pointer spacing issues
            handle blank continuation lines on macros
            classify all reports error, warning, or check
            revamp hanging { checks and apply in context
            no spaces after the last ; in a for is ok
            check __setup has a corresponding addition to documentation
      
      David Woodhouse (1):
            limit character set used in patches and descriptions to UTF-8
      Signed-off-by: Andy Whitcroft <apw@shadowen.org>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: "Randy.Dunlap" <rdunlap@xenotime.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • LZO1X: fix lzo1x_worst_compress · f2a11b15
      Nitin Gupta authored
      This is a correction for a macro which gives worst case compressed data
      size by LZO1X.
      
      This patch was provided by the LZO author (Markus Oberhumer).
      Signed-off-by: Nitin Gupta <nitingupta910@gmail.com>
      Cc: "Markus F.X.J. Oberhumer" <markus@oberhumer.com>
      Cc: "Richard Purdie" <rpurdie@openedhand.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Add entries to MAINTAINERS for I/OAT and DMAENGINE · 248a9dc3
      Nelson, Shannon authored
      Add entries to MAINTAINERS for I/OAT and DMAENGINE
      Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
      Cc: Chris Leech <christopher.leech@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • jbd2 commit: fix transaction dropping · f89b7795
      Jan Kara authored
      We also have to check the second checkpoint list before dropping the
      transaction.
      Signed-off-by: Jan Kara <jack@suse.cz>
      Cc: Chuck Ebbert <cebbert@redhat.com>
      Cc: Kirill Korotaev <dev@openvz.org>
      Cc: <linux-ext4@vger.kernel.org>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • jbd commit: fix transaction dropping · fe28e42b
      Jan Kara authored
      We also have to check the second checkpoint list before dropping the
      transaction.
      Signed-off-by: Jan Kara <jack@suse.cz>
      Cc: Chuck Ebbert <cebbert@redhat.com>
      Cc: Kirill Korotaev <dev@openvz.org>
      Cc: <linux-ext4@vger.kernel.org>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • authgss build fix · 09561f44
      Andrew Morton authored
      Recent breakage..
      
      net/sunrpc/auth_gss/auth_gss.c:1002: warning: implicit declaration of function 'lock_kernel'
      net/sunrpc/auth_gss/auth_gss.c:1004: warning: implicit declaration of function 'unlock_kernel'
      
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 15 Jul, 2007 20 commits