1. 05 Jul, 2002 8 commits
  2. 04 Jul, 2002 32 commits
    • Merge home.transmeta.com:/home/torvalds/v2.5/viro · 75eead62
      Linus Torvalds authored
      into home.transmeta.com:/home/torvalds/v2.5/linux
    • [PATCH] ->i_dev switched to dev_t · 88cc0d3e
      Alexander Viro authored
      	* ->i_dev followed the example of ->s_dev - it's dev_t now.  All
      remaining uses of ->i_dev either outright want dev_t (stat()) or couldn't
      care less (printing major:minor in /proc/<pid>/maps, etc.)
    • [PATCH] assorted kdev_t cleanups in filesystems · c9add9b8
      Alexander Viro authored
      	* JFS uses its ->logdev only twice - one of the places assigns
      it to_kdev_t(le32_to_cpu(...)), another uses kdev_t_to_nr() of it.
      Switched to u32 - it's just a place where we store device number we'd got
      from superblock.
      	* several reiserfs_fs.h function prototypes removed - functions
      in question don't exist anymore.
      	* smbfs doesn't support device nodes; ->f_rdev removed.
    • [PATCH] ex_dev switched to dev_t · ab6a5810
      Alexander Viro authored
      	* svc_export ->ex_dev turned into dev_t.  It's a pure search
      key and all places that set it actually do to_kdev_t(some_dev_t_expression).
    • [PATCH] raid kdev_t cleanups - part 3 · dc5d0e46
      Alexander Viro authored
      	* ->dev killed for md/linear.c (same as previous parts)
    • [PATCH] md_import_device() cleanup · b60f0c2b
      Alexander Viro authored
      	* md_import_device() returns resulting rdev or ERR_PTR(error)
      instead of returning 0 or error and letting caller find rdev.
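The "pointer or ERR_PTR(error)" convention can be illustrated with a small userspace sketch. The helpers below only imitate the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() under the assumption that the top 4095 addresses are never valid pointers; `struct rdev` and `import_device()` are hypothetical stand-ins, not the md code.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace imitation of the kernel helpers (assumption: the last
 * MAX_ERRNO addresses are never handed out as real pointers). */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-in for mdk_rdev_t. */
struct rdev { int dev_nr; };

static struct rdev demo_rdev;

/* Hypothetical import function: hands back the resulting rdev on
 * success, or the error encoded in the pointer on failure. */
static struct rdev *import_device(int dev_nr)
{
	if (dev_nr < 0)
		return ERR_PTR(-EINVAL);
	demo_rdev.dev_nr = dev_nr;
	return &demo_rdev;
}
```

A caller checks IS_ERR() on the result and extracts the error with PTR_ERR(), so it no longer has to look the rdev up after a successful call.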
    • [PATCH] raid kdev_t cleanups - part 2 · 881c3bc1
      Alexander Viro authored
      	* a bunch of callers of partition_name() are now calling
      bdev_partition_name().
      	* the last users of raid1 and multipath ->dev are gone; so are
      the fields in question.
    • [PATCH] raid ->diskop() splitup · f3ddcd6b
      Alexander Viro authored
      	* ->diskop() split into individual methods; prototypes cleaned
      up.  In particular, handling of hot_add_disk() gets mdk_rdev_t * of
      the component we are adding as an argument instead of playing the games
      with major/minor.  Code cleaned up.
    • [PATCH] raid kdev_t cleanups (part 1) · 480f4106
      Alexander Viro authored
      	* ->error_handler() switched to struct block_device *.
      	* md_sync_acct() switched to struct block_device *.
      	* raid5 struct disk_info ->dev is gone - we use ->bdev everywhere.
      	* a bunch of kdev_same() calls are removed from drivers/md/*.c where we
      have the corresponding struct block_device * and can simply compare them.
    • [PATCH] kdev_t crapectomy · a99f1593
      Alexander Viro authored
      	* since the last caller of is_read_only() is gone, the function
      itself is removed.
      	* destroy_buffers() is not used anymore; gone.
      	* fsync_dev() is gone; the only user is (broken) lvm.c and first
      step in fixing lvm.c will consist of propagating struct block_device *
      anyway; at that point we'll just use fsync_bdev() in there.
      	* prototype of bio_ioctl() removed - function doesn't exist
      anymore.
    • [PATCH] cdrom.c cleanups · 67addbac
      Alexander Viro authored
      	* Bunch of functions in cdrom.c used to get kdev_t and use it
      only to do cdrom_find_device(dev), even though their callers already
      had struct cdrom_device_info * in question.  Switched to passing
      said pointer directly.
      	* useless exports removed; stuff not used outside of cdrom.c
      made static.
    • [PATCH] (md.c) block device size cleanups · 123caef2
      Alexander Viro authored
      	* calc_dev_sboffset() and calc_dev_size() in md.c are getting
      mdk_rdev_t * instead of kdev_t.  Callers updated.
      	* calls of blkdev_size_in_bytes() in md.c replaced with use
      of rdev->bdev->bd_inode->i_size.
    • [PATCH] devpts cleanup · 2aa85937
      Alexander Viro authored
      	* devpts "upcalls" eliminated.
      	* instead of playing games with revalidation we simply use
      ramfs-style tree and kill dentries upon devpts_pty_kill().  That
      allows us to get rid of a lot of code in fs/devpts/*.c.
      	* devpts_fs.h cleaned up.
      	* devpts/root.c and devpts/devpts_i.h removed.
      	* array of pointers to devpts inodes killed; with ramfs-style tree
      it's not needed anymore.
      	* devpts/inode.c cleaned up.
      	* devpts_pty_new() used to get mk_kdev() only to convert it to
      dev_t (hardly a surprise, since it's mknod() in disguise).  Now it gets
      dev_t as an argument.
    • Merge home.transmeta.com:/home/torvalds/v2.5/akpm · 78f1f626
      Linus Torvalds authored
      into home.transmeta.com:/home/torvalds/v2.5/linux
    • [PATCH] Use names, not numbers for pagefault types · f1dfe022
      Andrew Morton authored
      This is Bill Irwin's cleanup patch which gives symbolic names to the
      fault types:
      
      	#define VM_FAULT_OOM	(-1)
      	#define VM_FAULT_SIGBUS	0
      	#define VM_FAULT_MINOR	1
      	#define VM_FAULT_MAJOR	2
      
      Only arch/i386 has been updated - other architectures can do this too.
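A minimal sketch of how a caller might dispatch on the symbolic names rather than on bare numbers. The four defines are copied from the message above; `fault_name()` is a hypothetical helper for illustration, not part of the patch.

```c
#define VM_FAULT_OOM	(-1)
#define VM_FAULT_SIGBUS	0
#define VM_FAULT_MINOR	1
#define VM_FAULT_MAJOR	2

/* Hypothetical helper: map a fault return value to a readable label. */
static const char *fault_name(int ret)
{
	switch (ret) {
	case VM_FAULT_OOM:
		return "oom";
	case VM_FAULT_SIGBUS:
		return "sigbus";
	case VM_FAULT_MINOR:
		return "minor";
	case VM_FAULT_MAJOR:
		return "major";
	default:
		return "unknown";
	}
}
```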
    • [PATCH] reduce lock contention in try_to_free_buffers() · 5feb041e
      Andrew Morton authored
      The blockdev mapping's private_lock is fairly contended.  The buffer
      LRU cache fixed a lot of that, but under page replacement load,
      try_to_free_buffers is still showing up.
      
      Moving the freeing of buffer_heads outside the lock reduces contention
      in there by 30%.
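The general idea of shortening a critical section by detaching under the lock and freeing afterwards can be sketched in userspace as follows. This is not the buffer_head code: `struct buf`, the toy spinlock built on a C11 atomic_flag, and both functions are hypothetical.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for a buffer_head on a singly linked list. */
struct buf { struct buf *next; };

static atomic_flag lock = ATOMIC_FLAG_INIT;	/* toy spinlock */
static struct buf *attached;			/* protected by lock */

static void spin_lock(void)
{
	while (atomic_flag_test_and_set_explicit(&lock, memory_order_acquire))
		;	/* spin */
}

static void spin_unlock(void)
{
	atomic_flag_clear_explicit(&lock, memory_order_release);
}

static void attach_buffer(void)
{
	struct buf *b = malloc(sizeof(*b));

	if (!b)
		return;
	spin_lock();
	b->next = attached;
	attached = b;
	spin_unlock();
}

/* Detach the whole list while holding the lock, then free the entries
 * after dropping it.  Returns the number of entries freed. */
static int free_buffers(void)
{
	struct buf *to_free;
	int n = 0;

	spin_lock();
	to_free = attached;
	attached = NULL;
	spin_unlock();

	while (to_free) {
		struct buf *next = to_free->next;

		free(to_free);
		to_free = next;
		n++;
	}
	return n;
}
```

Only the pointer swap happens under the lock, so other users of the lock never wait for the allocator.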
    • [PATCH] debug: check page refcount in __free_pages_ok() · 63a07153
      Andrew Morton authored
      Add a BUG() check to __free_pages_ok() - to catch someone freeing a
      page which has a non-zero refcount.  Actually, this check is mainly to
      catch someone (ie: shrink_cache()) incrementing a page's refcount
      shortly after it has been freed.
      
      Also clean up __free_pages_ok() a bit and convert lots of BUGs to BUG_ON.
    • [PATCH] fix invalidate_inode_pages2() race · ea66b69c
      Andrew Morton authored
      Fix a buglet in invalidate_list_pages2(): there is a small window in
      which writeback could start against the page before this function locks
      it.
      
      The patch closes the race by performing the PageWriteback test inside
      PageLocked.
      
      Testing PageWriteback inside PageLocked is "definitive" - when a page
      is locked, writeback cannot start against it.
    • [PATCH] JBD commit callback capability · 8b00e4fa
      Andrew Morton authored
      This is a patch which Stephen has applied to ext3's 2.4 repository.
      Originally written by Andreas, generalised somewhat by Stephen.
      
      Add jbd callback mechanism, requested for InterMezzo.  We allow the jbd's
      client to request notification when a given handle's IO finally commits to
      disk, so that clients can manage their own writeback state asynchronously.
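A rough sketch of the callback shape described above: a client attaches a function to a handle and is notified when that handle's commit completes. All names here (`struct handle`, `handle_set_callback()`, `journal_commit_done()`, `mark_written()`) are hypothetical and do not reproduce the actual jbd interface.

```c
#include <stddef.h>

/* Hypothetical callback type: called with the client's data and the
 * commit status (0 on success). */
typedef void (*commit_cb_t)(void *data, int error);

/* Hypothetical stand-in for a journal handle. */
struct handle {
	commit_cb_t cb;
	void *cb_data;
};

/* Client side: request notification when this handle commits. */
static void handle_set_callback(struct handle *h, commit_cb_t cb, void *data)
{
	h->cb = cb;
	h->cb_data = data;
}

/* Journal side: invoked once the handle's IO has reached disk. */
static void journal_commit_done(struct handle *h, int error)
{
	if (h->cb)
		h->cb(h->cb_data, error);
}

/* Example client callback: record that the data is now stable. */
static void mark_written(void *data, int error)
{
	if (!error)
		*(int *)data = 1;
}
```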
    • [PATCH] ext3 truncate fix · 66c1d66f
      Andrew Morton authored
      Forward-port of a fix which Stephen has applied to ext3's 2.4 CVS tree.
      
      Fix for a rare problem seen under stress in data=journal mode: if we
      have to restart a truncate transaction while traversing the inode's
      direct blocks, we need to deal with bh==NULL in ext3_clear_blocks.
    • [PATCH] combine generic_writepages() and mpage_writepages() · c0902cac
      Andrew Morton authored
      generic_writepages and mpage_writepages are basically identical,
      except one calls ->writepage() and the other calls mpage_writepage().
      This duplication is irritating.
      
      The patch folds generic_writepages() into mpage_writepages().  It does
      this rather kludgily: if the get_block argument to mpage_writepages()
      is NULL then use ->writepage().
      
      Can't think of a better way, really - we could go for a fully-blown
      write_actor_t thing, but that would be overly elaborate and would not
      allow mpage_writepage() to be inlined inside mpage_writepages(), which
      is rather desirable.
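The "NULL get_block means use ->writepage()" dispatch can be sketched like this. The types and signatures are heavily simplified and hypothetical (a real get_block_t takes more arguments); only the branching idea is taken from the message.

```c
#include <stddef.h>

/* Simplified, hypothetical stand-ins for the kernel types. */
struct page { int index; };
typedef int (get_block_t)(struct page *page);

struct mapping {
	int (*writepage)(struct page *page);
};

static int writepage_calls, mpage_calls;	/* counters for the demo */

static int fs_writepage(struct page *page)
{
	(void)page;
	writepage_calls++;
	return 0;
}

static int demo_get_block(struct page *page)
{
	(void)page;
	return 0;
}

static int mpage_writepage_one(struct page *page, get_block_t *get_block)
{
	mpage_calls++;
	return get_block(page);
}

/* One loop serves both cases: with a get_block callback the mpage path
 * is used, with NULL the mapping's own ->writepage() is called. */
static int writepages(struct mapping *m, struct page *pages, int n,
		      get_block_t *get_block)
{
	int i, ret;

	for (i = 0; i < n; i++) {
		if (get_block)
			ret = mpage_writepage_one(&pages[i], get_block);
		else
			ret = m->writepage(&pages[i]);
		if (ret)
			return ret;
	}
	return 0;
}
```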
    • [PATCH] fix a writeback race · 2ab9665b
      Andrew Morton authored
      Fixes a bug in generic_writepages() and its cut-n-paste-cousin,
      mpage_writepages().
      
      The code was clearing PageDirty and then bailing out if it discovered
      the page was under writeback.  Which would cause the dirty bit to be
      lost.
      
      It's a very small window, but reversing the order so PageDirty is only
      cleared when we know for-sure that IO will be started fixes it up.
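The ordering problem can be shown with a toy model that uses plain booleans instead of real page flags. `struct page` and both functions are hypothetical; they only contrast "clear dirty, then check writeback" with "check writeback, then clear dirty".

```c
#include <stdbool.h>

/* Toy page: two booleans stand in for PG_dirty and PG_writeback. */
struct page { bool dirty; bool writeback; };

/* Old order: the dirty bit is cleared before we know IO will start. */
static bool start_io_buggy(struct page *p)
{
	bool was_dirty = p->dirty;

	p->dirty = false;
	if (p->writeback)
		return false;	/* bail out: dirty bit is now lost */
	if (!was_dirty)
		return false;
	p->writeback = true;
	return true;
}

/* New order: only clear dirty once IO is certain to be started. */
static bool start_io_fixed(struct page *p)
{
	if (p->writeback)
		return false;	/* page stays dirty for a later pass */
	if (!p->dirty)
		return false;
	p->dirty = false;
	p->writeback = true;
	return true;
}
```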
    • [PATCH] suppress more allocation failure warnings · 193ae036
      Andrew Morton authored
      The `page allocation failure' warning in __alloc_pages() is being a
      pain.  But I'm persisting with it...
      
      The patch renames PF_RADIX_TREE to PF_NOWARN, and uses it in a few
      places where allocation failures are known to happen.  These code
      paths are well-tested now and suppressing the warning is OK.
    • [PATCH] always update page->flags atomically · a2b41d23
      Andrew Morton authored
      move_from_swap_cache() and move_to_swap_cache() are playing with
      page->flags nonatomically.  The page is on the LRU at the time and
      another CPU could be altering page->flags concurrently.
      
      The patch converts those functions to use atomic operations.
      
      It also rationalises the number of bits which are cleared.  It's not
      really clear to me what page flags we really want to set to a known
      state in there.
      
      It had no right to go clearing PG_arch_1.  I'm now clearing PG_arch_1
      inside rmqueue() which is still a bit presumptuous.
      
      btw: shmem uses PAGE_CACHE_SIZE and swapper_space uses PAGE_SIZE.  I've
      been carefully maintaining the distinction, but it looks like shmem
      will break if we ever do make these values different.
      
      
      Also, __add_to_page_cache() was performing a non-atomic RMW against
      page->flags, under the assumption that it was a newly allocated page
      which no other CPU would look at.  Not true - this function is used for
      moving anon pages into swapcache.  Those anon pages are on the LRU -
      other CPUs can be performing operations against page->flags while
      __add_to_swap_cache is stomping on them.  This had me running around in
      circles for two days.
      
      So let's move the initialisation of the page state into rmqueue(),
      where the page really is new (could do it in page_cache_alloc,
      perhaps).
      
      The SetPageLocked() in __add_to_page_cache() is also rather curious.
      Seems OK for both pagecache and swapcache so I covered that with a
      comment.
      
      
      2.4 has the same problem.  Basically, add_to_swap_cache() can stomp on
      another CPU's manipulation of page->flags.  After a quick review of the
      code there, it is barely conceivable that a concurrent refill_inactive()
      could get its PG_referenced and PG_active bits scribbled on.  Rather
      unlikely because swap_out() will probably see PageActive() and bail
      out.
      
      Also, mark_dirty_kiobuf() could have its PG_dirty bit accidentally
      cleared (but try_to_swap_out() sets it again later).
      
      But there may be other code paths.  Really, I think this needs fixing
      in 2.4 - it's horrid.
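As a userspace analogy for updating a shared flags word atomically, the sketch below uses C11 atomics rather than the kernel's bitops. The bit positions and helper names are hypothetical; the point is that each update is a single atomic read-modify-write, so a concurrent change to a different bit cannot be overwritten.

```c
#include <stdatomic.h>

/* Hypothetical bit positions, chosen only for this example. */
#define PG_referenced	(1UL << 0)
#define PG_dirty	(1UL << 1)
#define PG_active	(1UL << 2)

struct page { atomic_ulong flags; };

/* Atomic RMW: clears only the bits in mask, leaving the rest intact
 * even if another thread is modifying them at the same time. */
static void clear_page_flags(struct page *p, unsigned long mask)
{
	atomic_fetch_and(&p->flags, ~mask);
}

static void set_page_flags(struct page *p, unsigned long mask)
{
	atomic_fetch_or(&p->flags, mask);
}

static unsigned long page_flags(struct page *p)
{
	return atomic_load(&p->flags);
}
```

A plain `flags &= ~mask` on a non-atomic word would instead load, modify and store in separate steps, which is the window the message describes.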
    • [PATCH] Use __GFP_HIGH in mpage_writepages() · a263b647
      Andrew Morton authored
      In mpage_writepage(), use __GFP_HIGH when allocating the BIO: writeback
      is a memory reclaim function and is entitled to dip into the page
      reserves to get its IO underway.
    • [PATCH] resurrect __GFP_HIGH · 371151c9
      Andrew Morton authored
      This patch reinstates __GFP_HIGH functionality.
      
      __GFP_HIGH means "able to dip into the emergency pools".  However,
      somewhere along the line this got broken.  __GFP_HIGH ceased to do
      anything.  Instead, !__GFP_WAIT is used to tell the page allocator to
      try harder.
      
      __GFP_HIGH makes sense.  The concepts of "unable to sleep" and "should
      try harder" are quite separate, and overloading !__GFP_WAIT to mean
      "should access emergency pools" seems wrong.
      
      This patch fixes a problem in mempool_alloc().  mempool_alloc() tries
      the first allocation with __GFP_WAIT cleared.  If that fails, it tries
      again with __GFP_WAIT enabled (if the caller can support __GFP_WAIT).
      So it is currently performing an atomic allocation first, even though
      the caller said that they're prepared to go in and call the page
      stealer.
      
      I thought this was a mempool bug, but Ingo said:
      
      > no, it's not GFP_ATOMIC. The important difference is __GFP_HIGH, which
      > triggers the intrusive highprio allocation mode. Otherwise gfp_nowait is
      > just a nonblocking allocation of the same type as the original gfp_mask.
      > ...
      > what i've added is a bit more subtle allocation method, with both
      > performance and balancing-correctness in mind:
      >
      > 1. allocate via gfp_mask, but nonblocking
      > 2. if failure => try to get from the pool if the pool is 'full enough'.
      > 3. if failure => allocate with gfp_mask [which might block]
      >
      > there is performance data that this method improves bounce-IO performance
      > significantly, because even under VM pressure (when gfp_mask would block)
      > we can still use up to 50% of the memory pool without blocking (and
      > without endangering deadlock-free allocation). Ie. the memory pool is also
      > a fast 'frontside cache' of memory elements.
      
      Ingo was assuming that __GFP_HIGH was still functional.  It isn't, and the
      mempool design wants it.
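The three-step allocation order quoted above can be modelled with a toy pool. Everything here is hypothetical: `backing_alloc()` simulates the page allocator, `under_pressure` simulates VM pressure, and "full enough" is assumed to mean more than half the pool remains, following the "up to 50%" remark in the quote.

```c
#include <stdbool.h>
#include <stddef.h>

#define POOL_SIZE 4

/* Toy pool of preallocated elements. */
struct pool {
	void *elems[POOL_SIZE];
	int curr_nr;
};

static bool under_pressure;	/* simulated VM pressure */
static char backing[16];	/* simulated backing memory */

/* Hypothetical backing allocator: under pressure it only succeeds if
 * the caller is allowed to block. */
static void *backing_alloc(bool may_block)
{
	if (under_pressure && !may_block)
		return NULL;
	return backing;
}

static void *pool_alloc(struct pool *p, bool caller_may_block)
{
	/* 1. try the normal allocation, but nonblocking */
	void *e = backing_alloc(false);

	if (e)
		return e;
	/* 2. take from the pool while it is 'full enough' (assumed: > 50%) */
	if (p->curr_nr > POOL_SIZE / 2)
		return p->elems[--p->curr_nr];
	/* 3. fall back to an allocation that may block, if permitted */
	if (caller_may_block)
		return backing_alloc(true);
	return NULL;
}
```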
    • [PATCH] set_page_dirty() in mark_dirty_kiobuf() · 9bd6f86b
      Andrew Morton authored
      Yet another SetPageDirty/set_page_dirty bugfix: mark_dirty_kiobuf needs
      to run set_page_dirty() so the page goes onto its mapping's dirty_pages
      list.
    • [PATCH] check for O_DIRECT capability in open(), not write() · 6ef5d4bb
      Andrew Morton authored
      For O_DIRECT opens we're currently checking that the fs supports
      O_DIRECT at write(2)-time.
      
      This is a forward-port of Andrea's patch which moves the check to
      open() time.  Seems more sensible.
    • [PATCH] set TASK_RUNNING in yield() · b5b6fa52
      Andrew Morton authored
      It seems that the yield() macro requires state TASK_RUNNING, but
      practically none of the callers remember to do that.
      
      The patch turns yield() into a real function which sets state
      TASK_RUNNING before scheduling.
    • [PATCH] set TASK_RUNNING in cond_resched() · b2bd3a26
      Andrew Morton authored
      do_select() does set_current_state(TASK_INTERRUPTIBLE) then calls
      __pollwait() which calls __get_free_page() and the cond_resched() which
      I added to the pagecache reclaim code never returns.
      
      The patch makes cond_resched() more useful by setting current->state to
      TASK_RUNNING before scheduling.
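A toy model of why the task state matters when rescheduling voluntarily. The scheduler here is a one-line stand-in and every name is hypothetical; it only captures the rule that a task which enters the scheduler in a sleeping state is not run again until something wakes it.

```c
enum task_state { TASK_RUNNING, TASK_INTERRUPTIBLE };

struct task {
	enum task_state state;
	int need_resched;
};

/* Toy scheduler: returns 1 if the task remains runnable, 0 if it was
 * put to sleep and now depends on an explicit wakeup. */
static int toy_schedule(struct task *t)
{
	t->need_resched = 0;
	return t->state == TASK_RUNNING;
}

/* Old behaviour: schedule with whatever state the caller left behind. */
static int cond_resched_old(struct task *t)
{
	if (t->need_resched)
		return toy_schedule(t);
	return 1;
}

/* New behaviour: force TASK_RUNNING before scheduling. */
static int cond_resched_new(struct task *t)
{
	if (t->need_resched) {
		t->state = TASK_RUNNING;
		return toy_schedule(t);
	}
	return 1;
}
```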
    • [PATCH] add new list_splice_init() · f42e6ed8
      Andrew Morton authored
      A little cleanup: Most callers of list_splice() immediately
      reinitialise the source list_head after calling list_splice().
      
      So create a new list_splice_init() which does all that.
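A self-contained sketch of the idea, using a minimal reimplementation of a circular doubly linked list in the style of the kernel's list_head. This is illustrative userspace code, not the kernel's list.h.

```c
struct list_head { struct list_head *next, *prev; };

static void INIT_LIST_HEAD(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static int list_empty(const struct list_head *h)
{
	return h->next == h;
}

static void list_add_tail(struct list_head *new, struct list_head *head)
{
	new->prev = head->prev;
	new->next = head;
	head->prev->next = new;
	head->prev = new;
}

/* Move all entries of 'list' to the front of 'head'.  The source head
 * is left with stale pointers and must be reinitialised by the caller. */
static void list_splice(struct list_head *list, struct list_head *head)
{
	if (!list_empty(list)) {
		struct list_head *first = list->next;
		struct list_head *last = list->prev;
		struct list_head *at = head->next;

		first->prev = head;
		head->next = first;
		last->next = at;
		at->prev = last;
	}
}

/* Splice and reinitialise the source in one call. */
static void list_splice_init(struct list_head *list, struct list_head *head)
{
	list_splice(list, head);
	INIT_LIST_HEAD(list);
}
```

After list_splice_init() the source list is empty and safe to reuse immediately, which is what the callers mentioned above were doing by hand.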
    • [PATCH] shmem fixes · e7c89646
      Andrew Morton authored
      A shmem cleanup/bugfix patch from Hugh Dickins.
      
      - Minor: in try_to_unuse(), only wait on writeout if we actually
        started new writeout.  Otherwise, there is no need because a
        wait_on_page_writeback() has already been executed against this page.
        And it's locked, so no new writeback can start.
      
      - Minor: in shmem_unuse_inode(): remove all the
        wait_on_page_writeback() logic.  We already did that in
        try_to_unuse(), and the page is locked so no new writeback can start.
      
      - Less minor: add a missing page_cache_release() to
        shmem_get_page_locked() in the uncommon case where the page was found
        to be under writeout.