1. 12 Apr, 2004 40 commits
    • [PATCH] jbd: b_transaction zeroing cleanup · a1fa32d7
      Andrew Morton authored
      Almost everywhere where JBD removes a buffer from the transaction lists the
      caller then nulls out jh->b_transaction.  Sometimes, the caller does that
      without holding the locks which are defined to protect b_transaction.  This
      makes me queasy.
      
      So change things so that __journal_unfile_buffer() nulls out b_transaction
      inside both j_list_lock and jbd_lock_bh_state().
      
      It cleans things up a bit, too.
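
A minimal userspace sketch of the idea, assuming pthread mutexes as stand-ins for j_list_lock and jbd_lock_bh_state(). The struct layout and the helper names `unfile_buffer_locked` and `unfile_buffer` are hypothetical and are not the JBD code; the point is only that the field is cleared inside the helper, while both locks are held, rather than by each caller afterwards.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for the JBD structures; not the kernel layout. */
struct transaction { int tid; };

struct journal_head {
    pthread_mutex_t state_lock;        /* stand-in for jbd_lock_bh_state() */
    struct transaction *b_transaction; /* protected by both locks */
    int on_list;                       /* nonzero while filed on a list */
};

/* Stand-in for the journal-wide j_list_lock. */
static pthread_mutex_t j_list_lock = PTHREAD_MUTEX_INITIALIZER;

/* Caller holds both locks: unfile the buffer AND clear its owner here,
 * so no caller has to null the field later without the locks. */
static void unfile_buffer_locked(struct journal_head *jh)
{
    jh->on_list = 0;
    jh->b_transaction = NULL;
}

/* Convenience wrapper that takes the per-buffer lock, then the list lock. */
static void unfile_buffer(struct journal_head *jh)
{
    pthread_mutex_lock(&jh->state_lock);
    pthread_mutex_lock(&j_list_lock);
    unfile_buffer_locked(jh);
    pthread_mutex_unlock(&j_list_lock);
    pthread_mutex_unlock(&jh->state_lock);
}
```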
    • [PATCH] jbd: do_get_write_access lock contention reduction · cd5f8bb0
      Andrew Morton authored
      We're seeing heavy contention against j_list_lock on 8-way in
      do_get_write_access().
      
      We actually don't need j_list_lock in there except for one little case - the
      per-bh jbd_lock_bh_state() is sufficient to protect this buffer's internal
      state.
      
      On some nice quick LVM array Ram Pai measured an overall 3x speedup from this
      patch:
      
      the script took the following time on 2.6.5-mm1
       real    0m57.504s
       user    0m0.400s
       sys     7m29.867s

       and with the 2 patches it took
       real    0m19.983s
       user    0m0.438s
       sys     1m55.896s
    • [PATCH] Feed floppy.c through Lindent · 073c11c2
      Andrew Morton authored
      From: "Randy.Dunlap" <rddunlap@osdl.org>
    • [PATCH] dnotify_parent speedup · 0079e33e
      Andrew Morton authored
      From: Anton Blanchard <anton@samba.org>
      
      Directory notify code was showing up in a dd bs=1024k from 2 raid arrays
      on an emulex FC adapter:
      
      3635     69.4896  vmlinux-2.6.5            .default_idle
      332       6.3468  vmlinux-2.6.5            .__copy_tofrom_user
      112       2.1411  vmlinux-2.6.5            .save_remaining_regs
      76        1.4529  vmlinux-2.6.5            .scsi_dispatch_cmd
      64        1.2235  vmlinux-2.6.5            .dnotify_parent
      61        1.1661  vmlinux-2.6.5            .do_generic_mapping_read
      
      We already have a sysctl to enable/disable it, the patch below uses it
      in dnotify_parent. dnotify_parent disappears and idle time goes up:
      
      4508     70.8582  vmlinux-2.6.5            .default_idle
      253       3.9767  vmlinux-2.6.5            .__copy_tofrom_user
      142       2.2320  vmlinux-2.6.5            .save_remaining_regs
      88        1.3832  vmlinux-2.6.5            .shrink_zone
      84        1.3203  vmlinux-2.6.5            .elx_drvr_unlock
      75        1.1789  vmlinux-2.6.5            .scsi_dispatch_cmd
      69        1.0846  vmlinux-2.6.5            .do_generic_mapping_read
      
      Of course, to gain this small speedup users need to know to set
      /proc/sys/fs/dir-notify-enable to zero.  Nobody does that.
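
The shape of the change can be sketched in userspace as an early return on the sysctl flag. This is an illustration under assumptions, not the patch itself: `dir_notify_enable` here is a plain variable standing in for the sysctl value, and `notify_parent_slow`, `parent_walks` and `dnotify_parent_sketch` are hypothetical names.

```c
#include <assert.h>

/* Stand-in for the value behind /proc/sys/fs/dir-notify-enable. */
static int dir_notify_enable = 1;

/* Counts how often the expensive path runs; for illustration only. */
static int parent_walks;

/* Hypothetical placeholder for the work of locating and notifying
 * the parent directory. */
static void notify_parent_slow(void)
{
    parent_walks++;
}

/* Skip all of the work when directory notification is disabled. */
static void dnotify_parent_sketch(void)
{
    if (!dir_notify_enable)
        return;
    notify_parent_slow();
}
```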
    • [PATCH] cyclades works OK on SMP · 2c1dcf6c
      Andrew Morton authored
      From: Marcelo Tosatti <marcelo.tosatti@cyclades.com>
      
      The cyclades.c driver was marked BROKEN_ON_SMP during early 2.6.  It was
      fixed later on but the tag was left in Kconfig.
      
      The driver is not very smart wrt SMP locking, it can be improved.  There is
      only one spinlock per card which guarantees command block ordering and
      protects different shared data, which can be held for long periods.
      
      _But_ the locking works reliably, so remove the BROKEN_ON_SMP tag.
    • [PATCH] rename page_to_nodenum() · 66fb1123
      Andrew Morton authored
      From: "Martin J. Bligh" <mbligh@aracnet.com>
      
      I'd prefer we renamed this to page_to_nid() before anyone starts using it. 
      This fits with the naming convention of everything else (pfn_to_nid, etc). 
      Nobody uses it right now - I grepped the whole tree.
    • [PATCH] rmap 3 arches + mapping_mapped · fbf7adfa
      Andrew Morton authored
      From: Hugh Dickins <hugh@veritas.com>
      
      Some arches refer to page->mapping for their dcache flushing: use
      page_mapping(page) for safety, to avoid confusion on anon pages, which will
      store a different pointer there - though in most cases flush_dcache_page is
      being applied to pagecache pages.
      
      arm has a useful mapping_mapped macro: move that to generic, and add
      mapping_writably_mapped, to avoid explicit list_empty checks on i_mmap and
      i_mmap_shared in several places.
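
As a rough illustration of what such helpers replace, here is a userspace sketch assuming a cut-down `address_space` with only the two lists named above. The tiny `list_head` implementation and the `init_list_head` helper are stand-ins written for this example, and the helpers are shown as inline functions rather than macros.

```c
#include <assert.h>

/* Minimal circular list head, modeled loosely on the kernel's list_head. */
struct list_head { struct list_head *next, *prev; };

static void init_list_head(struct list_head *h) { h->next = h; h->prev = h; }
static int list_empty(const struct list_head *h) { return h->next == h; }

static void list_add(struct list_head *n, struct list_head *h)
{
    n->next = h->next;
    n->prev = h;
    h->next->prev = n;
    h->next = n;
}

/* Hypothetical cut-down address_space: only the two mapping lists. */
struct address_space {
    struct list_head i_mmap;        /* private mappings */
    struct list_head i_mmap_shared; /* shared mappings */
};

/* True if anything maps this address_space at all. */
static int mapping_mapped(const struct address_space *m)
{
    return !list_empty(&m->i_mmap) || !list_empty(&m->i_mmap_shared);
}

/* True if a shared (potentially writable) mapping exists. */
static int mapping_writably_mapped(const struct address_space *m)
{
    return !list_empty(&m->i_mmap_shared);
}
```

Callers then test `mapping_mapped(mapping)` instead of open-coding two `list_empty` checks.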
      
      Very tempted to add page_mapped(page) tests, perhaps along with the
      mapping_writably_mapped tests in do_generic_mapping_read and
      do_shmem_file_read, to cut down on wasted flush_dcache effort; but the
      serialization is not obvious, too unsafe to do in a hurry.
    • [PATCH] rw_swap_page_sync fixes · da47ca23
      Andrew Morton authored
      Fix up the rw_swap_page_sync() horrors by fully decoupling this function
      from the VM - it is now just a helper function which reads a page from or
      writes a page to swap.
    • [PATCH] rmap 2 anon and swapcache · 4875a601
      Andrew Morton authored
      From: Hugh Dickins <hugh@veritas.com>
      
      Tracking anonymous pages by anon_vma,pgoff or mm,address needs a
      pointer,offset pair in struct page: mapping,index the natural choice.  But
      swapcache uses those for &swapper_space,swp_entry_t.
      
      It's trivial to separate swapcache from pagecache with radix tree; most of
      swapper_space is actually unused, just a fiction to pretend swap like file;
      and page->private is a good place to keep swp_entry_t, now that swap never
      uses bufferheads.
      
      Define PG_anon bit, page_add_rmap SetPageAnon and put an oopsable address in
      page->mapping to test that we're not confused by it.  Define
      page_mapping(page) macro to give NULL when PageAnon, whatever may be in
      page->mapping.  Define PG_swapcache bit, deduce swapper_space from that in
      the few places we need it.
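
The page_mapping(page) rule described here can be sketched as follows. The struct layout, the `PG_ANON_SKETCH` bit value and the `page_anon` helper are assumptions made for this example; only the behaviour (NULL for anon pages, whatever page->mapping holds) comes from the text.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cut-down types; the real struct page has many more fields. */
struct address_space { int dummy; };

#define PG_ANON_SKETCH (1UL << 0)   /* illustrative bit, not the kernel value */

struct page {
    unsigned long flags;
    struct address_space *mapping;  /* not an address_space when anon */
};

static int page_anon(const struct page *p)
{
    return (p->flags & PG_ANON_SKETCH) != 0;
}

/* Give NULL for anonymous pages regardless of what page->mapping stores. */
static struct address_space *page_mapping(const struct page *p)
{
    return page_anon(p) ? NULL : p->mapping;
}
```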
      
      add_to_swap_cache now distinct from add_to_page_cache.  Separating the caches
      somewhat simplifies the tmpfs swizzling in swap_state.c, now the page can
      briefly be in both caches.
      
      The rmap method remains pte chains, no change to that yet.  But one small
      functional difference: the use of PageAnon implies that a page truncated
      while still mapped will no longer be found and freed (swapped out) by
      try_to_unmap, will only be freed by exit or munmap.  But normally pages are
      unmapped by vmtruncate: this should only affect nonlinear mappings, and a
      later patch not in this batch will fix that.
    • [PATCH] rmap 1 linux/rmap.h · 4c4acd24
      Andrew Morton authored
      From: Hugh Dickins <hugh@veritas.com>
      
      First of a batch of three rmap patches: this initial batch of three paving
      the way for a move to some form of object-based rmap (probably Andrea's, but
      drawing from mine too), and making almost no functional change by itself.  A
      few days will intervene before the next batch, to give the struct page
      changes in the second patch some exposure before proceeding.
      
      rmap 1 create include/linux/rmap.h
      
      Start small: linux/rmap-locking.h has already gathered some declarations
      unrelated to locking, and the rest of the rmap declarations were over in
      linux/swap.h: gather them all together in linux/rmap.h, and rename the
      pte_chain_lock to rmap_lock.
    • [PATCH] CFQ io scheduler · 3e2ea65d
      Andrew Morton authored
      From: Jens Axboe <axboe@suse.de>
      
      CFQ I/O scheduler
    • [PATCH] Correct unplugs on nr_queued · 1dc841ed
      Andrew Morton authored
      From: Jens Axboe <axboe@suse.de>
      
      There's a small discrepancy in when we decide to unplug a queue based on
      q->unplug_thresh.  Basically it doesn't work for tagged queues, since
      q->rq.count[READ] + q->rq.count[WRITE] is just the number of allocated
      requests, not the number of requests stuck in the io scheduler.  We could
      just change the nr_queued == to a nr_queued >=, however that is still
      suboptimal.
      
      This patch adds accounting for requests that have been dequeued from the io
      scheduler, but not freed yet.  These are q->in_flight.  allocated_requests
      - q->in_flight == requests_in_scheduler.  So the condition correctly
      becomes
      
      	if (requests_in_scheduler == q->unplug_thresh)
      
      instead.  I did a quick round of testing, and for dbench on a SCSI disk the
      number of timer induced unplugs was reduced from 13 to 5 :-).  Not a huge
      number, but there might be cases where it's more significant.  Either way,
      it gets ->unplug_thresh always right, which the old logic didn't.
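
The accounting above can be worked through in a small sketch. The `queue_sketch` struct and the function names are hypothetical; only the arithmetic (allocated requests minus in-flight requests, compared against the threshold) follows the description.

```c
#include <assert.h>

/* Hypothetical cut-down request queue holding only the counters involved. */
struct queue_sketch {
    int allocated[2];   /* allocated requests: [0] = READ, [1] = WRITE */
    int in_flight;      /* dequeued from the io scheduler, not yet freed */
    int unplug_thresh;
};

/* Old condition: counts every allocated request, including in-flight ones. */
static int old_should_unplug(const struct queue_sketch *q)
{
    return q->allocated[0] + q->allocated[1] == q->unplug_thresh;
}

/* Requests still sitting in the io scheduler. */
static int requests_in_scheduler(const struct queue_sketch *q)
{
    return q->allocated[0] + q->allocated[1] - q->in_flight;
}

/* New condition: compare only what the scheduler still holds. */
static int should_unplug(const struct queue_sketch *q)
{
    return requests_in_scheduler(q) == q->unplug_thresh;
}
```

With 6 allocated requests, 2 of them in flight, and a threshold of 4, the old test misses the unplug while the new one triggers it.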
    • [PATCH] unplugging: md update · 66db15b4
      Andrew Morton authored
      From: Neil Brown <neilb@cse.unsw.edu.au>
      
      I've made a bunch of changes to the 'md' bits - largely moving the
      unplugging into the individual personalities which know more about which
      drives are actually in use.
    • [PATCH] Use BIO_RW_SYNC in swap write page · b1c72a96
      Andrew Morton authored
      From: Jens Axboe <axboe@suse.de>
      
      Dog slow software suspend found this one. If WB_SYNC_ALL, then you need
      to mark the bio as sync as well.
      
      This is because swap_writepage() does a remove_exclusive_swap_page() (going
      to __delete_from_swap_cache -> __remove_from_page_cache) which can kill
      page->mapping, thus aops->sync_page() has nothing to work with for unplugging
      the address space.
    • [PATCH] per-backing dev unplugging · 6d27f67b
      Andrew Morton authored
      From: Jens Axboe <axboe@suse.de>,
            Chris Mason,
            me, others.
      
      The global unplug list causes horrid spinlock contention on many-disk
      many-CPU setups - throughput is worse than halved.
      
      The other problem with the global unplugging is of course that it will cause
      the unplugging of queues which are unrelated to the I/O upon which the caller
      is about to wait.
      
      So what we do to solve these problems is to remove the global unplug and set
      up the infrastructure under which the VFS can tell the block layer to unplug
      only those queues which are relevant to the page or buffer_head which is
      about to be waited upon.
      
      We do this via the very appropriate address_space->backing_dev_info structure.
      
      Most of the complexity is in devicemapper, MD and swapper_space, because for
      these backing devices, multiple queues may need to be unplugged to complete a
      page/buffer I/O.  In each case we ensure that data structures are in place to
      permit us to identify all the lower-level queues which contribute to the
      higher-level backing_dev_info.  Each contributing queue is told to unplug in
      response to a higher-level unplug.
      
      To simplify things in various places we also introduce the concept of a
      "synchronous BIO": it is tagged with BIO_RW_SYNC.  The block layer will
      perform an immediate unplug when it sees one of these go past.
    • [PATCH] dm: remove __dm_request · 3749bf2c
      Andrew Morton authored
      From: Joe Thornber <thornber@redhat.com>
      
      dm.c: remove __dm_request (merge with previous patch).
    • [PATCH] Implement queue congestion callout for device mapper · 1fe10e2f
      Andrew Morton authored
      From: Miquel van Smoorenburg <miquels@cistron.nl>
            Joe Thornber <thornber@redhat.com>
      
      This implements the queue congestion callout for DM stacks, so that
      bdi_read/write_congested() return correct information.
      
      - md->lock protects all fields in md _except_ md->map
      - md->map_lock protects md->map
      - Anyone who wants to read md->map should use dm_get_table() which
        increments the table's reference count.
      
      This means the spin lock is now only held for the duration of a
      reference count increment.
      
      Update:
      
      dm.c: protect md->map with a rw spin lock rather than the md->lock
      semaphore.  Also ensure that everyone accesses md->map through
      dm_get_table(), rather than directly.
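
A userspace sketch of the locking rule, under stated assumptions: a pthread mutex is swapped in for the rw spin lock, the structs are cut down to the fields involved, and the `holders` field name and the non-freeing `dm_table_put` are illustrative only. What it shows is that the lock is held just long enough to read the pointer and bump the reference count.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical cut-down types; the real dm structures carry far more state. */
struct dm_table {
    atomic_int holders;         /* reference count */
};

struct mapped_device {
    pthread_mutex_t map_lock;   /* protects only the map pointer */
    struct dm_table *map;
};

/* Take a reference on the current table; the lock covers just the
 * pointer read and the reference count increment. */
static struct dm_table *dm_get_table(struct mapped_device *md)
{
    struct dm_table *t;

    pthread_mutex_lock(&md->map_lock);
    t = md->map;
    if (t)
        atomic_fetch_add(&t->holders, 1);
    pthread_mutex_unlock(&md->map_lock);
    return t;
}

/* Drop a reference taken by dm_get_table(); this sketch never frees. */
static void dm_table_put(struct dm_table *t)
{
    if (t)
        atomic_fetch_sub(&t->holders, 1);
}
```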
    • [PATCH] Add queue congestion callout · 6a435d69
      Andrew Morton authored
      From: Miquel van Smoorenburg <miquels@cistron.nl>
      
      The VM and VFS use the address_space->backing_dev_info to track the realtime
      status of the device which backs the mapping.  The read_congested and
      write_congested fields are used to determine whether a read or write
      against that device may block.
      
      We use this infrastructure to
      
      a) allow pdflush to service many queues in parallel (by not getting
         stuck on any particular one) and
      
      b) to avoid undesirable and uncontrolled latencies in places such as
         page reclaim and
      
      c) to avoid blocking in readahead operations.
      
      The current code only supports simple disk queues (and I have a patch here
      for NFS).  Stacked queues (MD and DM) don't get this information right and
      problems were expected.  Efficiency problems have now been noted and it's
      time to fix it.
      
      This patch lays down the infrastructure which permits the queue
      implementation to get control when someone at a higher level is querying
      the queue's congestion state.  So DM (for example) can run around and
      examine all the queues which contribute to the higher-level queue.
      
      
      It also adds bdi_rw_congested() for code in xfs and ext2 that calls both
      bdi_read_congested() and bdi_write_congested() in a row, and it was "free"
      anyway.
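
The callout can be pictured with the sketch below. It is an illustration under assumptions rather than the patch: the bit values, the `stacked_dev` type and the `stacked_congested` callback are hypothetical, and the `backing_dev_info` struct is cut down to the fields the example needs. A simple queue answers from its own state bits, while a stacked device installs a callback that examines every contributing queue.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_READ_CONGESTED  (1 << 0)   /* illustrative bit values */
#define SKETCH_WRITE_CONGESTED (1 << 1)

/* Hypothetical cut-down backing_dev_info. */
struct backing_dev_info {
    int state;                                   /* congestion bits */
    int (*congested_fn)(void *data, int bits);   /* optional callout */
    void *congested_data;
};

/* Ask the queue implementation if it installed a callout,
 * otherwise fall back to the device's own state bits. */
static int bdi_congested(struct backing_dev_info *bdi, int bits)
{
    if (bdi->congested_fn)
        return bdi->congested_fn(bdi->congested_data, bits);
    return (bdi->state & bits) != 0;
}

static int bdi_read_congested(struct backing_dev_info *bdi)
{
    return bdi_congested(bdi, SKETCH_READ_CONGESTED);
}

static int bdi_write_congested(struct backing_dev_info *bdi)
{
    return bdi_congested(bdi, SKETCH_WRITE_CONGESTED);
}

/* One query covering both directions. */
static int bdi_rw_congested(struct backing_dev_info *bdi)
{
    return bdi_congested(bdi, SKETCH_READ_CONGESTED | SKETCH_WRITE_CONGESTED);
}

/* Hypothetical stacked device: congested if any lower queue is. */
struct stacked_dev {
    struct backing_dev_info **lower;
    size_t count;
};

static int stacked_congested(void *data, int bits)
{
    struct stacked_dev *sd = data;

    for (size_t i = 0; i < sd->count; i++)
        if (bdi_congested(sd->lower[i], bits))
            return 1;
    return 0;
}
```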
    • [PATCH] s390: rewritten qeth driver · fa7bb531
      Andrew Morton authored
      From: Martin Schwidefsky <schwidefsky@de.ibm.com>
      
      The rewritten qeth network driver.
    • [PATCH] s390: crypto device driver part 2 · a1171283
      Andrew Morton authored
      From: Martin Schwidefsky <schwidefsky@de.ibm.com>
      
      The crypto device driver for PCICA & PCICC cards, part 2.
    • [PATCH] s390: crypto device driver part 1 · 58ebaaf0
      Andrew Morton authored
      From: Martin Schwidefsky <schwidefsky@de.ibm.com>
      
      The crypto device driver for PCICA & PCICC cards, part 1.
    • [PATCH] s390: zfcp log messages part 2 · 57f8dc81
      Andrew Morton authored
      From: Martin Schwidefsky <schwidefsky@de.ibm.com>
      
      zfcp host adapter log message cleanup part 2:
       - Shorten log output.
       - Increase log level for some messages.
       - Always print leading zeroes for wwpn and fcp-lun.
    • [PATCH] s390: zfcp log messages part 1 · 12c845ae
      Andrew Morton authored
      From: Martin Schwidefsky <schwidefsky@de.ibm.com>
      
      zfcp host adapter log message cleanup part 1:
       - Shorten log output.
       - Increase log level for some messages.
       - Always print leading zeroes for wwpn and fcp-lun.
    • [PATCH] s390: zfcp fixes (without kfree hack) · f9a56f8a
      Andrew Morton authored
      From: Martin Schwidefsky <schwidefsky@de.ibm.com>
      
      zfcp host adapter fixes:
       - Reuse freed scsi_ids and scsi_luns for mappings.
       - Order list of ports/units by assigned scsi_id/scsi_lun.
       - Don't update max_id/max_lun in scsi_host anymore.
       - Get rid of all magics.
       - Add owner field to ccw_driver structure.
       - Avoid deadlock on bus->subsys.rwsem.
       - Use a macro for all scsi device sysfs attributes.
       - Change proc_name from "dummy" to "zfcp".
       - Don't wait for scsi_add_device to complete while holding a semaphore.
       - Cleanup include files in zfcp_aux.c & zfcp_def.h.
       - Get rid of zfcp_erp_fsf_req_handler.
       - Proper link up/down handling.
       - Avoid possible NULL pointer dereference in zfcp_erp_schedule_work.
       - Remove module_exit function. Without an external release function for
         the zfcp_port/zfcp_unit objects module unloading is racy.
    • [PATCH] s390: dcss block driver fix · d959cc9f
      Andrew Morton authored
      From: Martin Schwidefsky <schwidefsky@de.ibm.com>
      
      DCSS block device driver changes:
       - Fix remove_store function, put_device is called too early.
    • [PATCH] s390: network driver fixes · f86f3b68
      Andrew Morton authored
      From: Martin Schwidefsky <schwidefsky@de.ibm.com>
      
      Network driver changes:
       - ctc: move kfree of driver structure after the last use of it.
       - netiucv: stay in state startwait if peer is down.
       - lcs: initialize ipm_list and unregister netdev only if it is present.
    • [PATCH] s390: dasd driver fix · 482ac593
      Andrew Morton authored
      From: Martin Schwidefsky <schwidefsky@de.ibm.com>
      
      dasd driver changes:
       - Fix check for device type in error recovery for fba devices.
    • [PATCH] s390: tape driver fixes · 2152527f
      Andrew Morton authored
      From: Martin Schwidefsky <schwidefsky@de.ibm.com>
      
      Tape driver changes:
       - Add missing break in tape_34xx_work_handler to avoid misleading message.
       - Cleanup offline/remove code.
    • [PATCH] s390: common i/o layer · 6a562864
      Andrew Morton authored
      From: Martin Schwidefsky <schwidefsky@de.ibm.com>
      
      Common i/o layer changes:
       - Avoid de-registering a ccwgroup device multiple times.
       - Remove check for channel path objects in get_subchannel_by_schid.
         Channel path objects are never in the bus list.
       - Avoid NULL pointer deref. in qdio_unmark_q.
       - Fix reference counting on subchannel objects.
       - Add shutdown function to terminate i/o and disable subchannels at reipl.
       - Remove all ccwgroup devices if the ccwgroup driver is unregistered.
    • [PATCH] s390: core s390 · 74216ef5
      Andrew Morton authored
      From: Martin Schwidefsky <schwidefsky@de.ibm.com>
      
      s390 core changes:
       - Fix _raw_spin_trylock for 64 bit.
       - Add clarification to s390 debug documentation.
    • [PATCH] hugetlb consolidation · c8b976af
      Andrew Morton authored
      From: William Lee Irwin III <wli@holomorphy.com>
      
      The following patch consolidates redundant code in various hugetlb
      implementations.  I took the liberty of renaming a few things, since the
      code was all moved anyway, and it has the benefit of helping to catch
      missed conversions and/or consolidations.
    • [PATCH] missing \n in timer_tsc.c · 618e7f44
      Andrew Morton authored
      From: Arjan van de Ven <arjanv@redhat.com>
      
      patch below fixes a missing \n in a printk; without this you get to see a
      <4> in the middle of that line...
    • [PATCH] 68knommu: add support for 64MHz clock for ColdFire boards · 3549c624
      Andrew Morton authored
      From: <gerg@snapgear.com>
      
      Add support for boards that have a 64MHz clock to the common ColdFire header.
    • [PATCH] 68knommu: 68EZ328/ucdimm setup code printk cleanup · 97773298
      Andrew Morton authored
      From: <gerg@snapgear.com>
      
      Add type specifier to printk calls in 68EZ328/ucdimm setup code.  Patch
      originally from kernel janitors.
    • [PATCH] 68knommu: cleanup startup code for 68EZ328 DragonEngine board · aa19aafa
      Andrew Morton authored
      From: <gerg@snapgear.com>
      
      Clean up debug trace in startup code of 68EZ328 DragonEngine board.
    • [PATCH] 68knommu: mk68knommu DragonEngine setup code printk cleanup · cd18c683
      Andrew Morton authored
      From: <gerg@snapgear.com>
      
      A couple of fixes for the DragonEngine specific setup code:
      
      . remove cs8900 ethernet setup from here
      . add type specifier to printk calls (from kernel janitors)
    • [PATCH] 68knommu: cleanup Motorola 68360 ints code · 6086d4fe
      Andrew Morton authored
      From: <gerg@snapgear.com>
      
      Some fixes for the 68360 common ints management code:
      
      . use irqreturn_t for return type of interrupt handlers
      . add type field to printk calls (from kernel janitors)
      . there is no loop in show_interrupts(), don't use continue
    • [PATCH] 68knommu: cleanup Motorola 68328 ints code · 66b80103
      Andrew Morton authored
      From: <gerg@snapgear.com>
      
      Some fixes for the 68328 common ints management code:
      
      . use irqreturn_t for return type of interrupt handlers
      . clean up asm code to be gcc-3.3.x clean
      . add type field to printk calls (from kernel janitors)
      . there is no loop in show_interrupts(), don't use continue
    • [PATCH] 68knommu: use irqreturn_t in Motorola 68328 setup code · c4cc53e3
      Andrew Morton authored
      From: <gerg@snapgear.com>
      
      A number of small fixes for the Motorola 68328 setup code:
      
      . fix interrupt routine return types to be irqreturn_t
      . add type specifier to printk calls (from kernel janitors)
      . rework asm code to be gcc-3.3.x clean
    • [PATCH] 68knommu: use irqreturn_t in ColdFire 5407 setup code · 1bb13b4c
      Andrew Morton authored
      From: <gerg@snapgear.com>
      
      Fixes to the Motorola ColdFire 5407 setup code:
      
      . fix interrupt routine return types to be irqreturn_t
      . add DMA base addresses array
      . support compile time setting of kernel boot arguments