Commits · cd2fbf89cd0083a99813ca985897a4e52d36ab4a · Kirill Smelkov / linux

23 Aug, 2004 3 commits

NFSv4: More aggressive caching if we have a delegation. · b42a8a16
Trond Myklebust authored 20 years ago
```
Signed-off-by: Trond Myklebust <trond.myklebust@fys.uio.no>
```
b42a8a16

NFSv2/v3/v4: Place NFS nfs_page shared data into a single structure · 6caf69fe

Trond Myklebust authored 20 years ago

   that hangs off filp->private_data. As a side effect, this also
   cleans up the NFSv4 private file state info.
Signed-off-by: Trond Myklebust <trond.myklebust@fys.uio.no>

6caf69fe

[PATCH] Fix posix file locking (9/9) · f6297acf

Trond Myklebust authored 20 years ago

NFSv2/v3: Fix up a race in the case where the user presses ^C while a
   process is in the middle of setting up a posix lock. In case the
   server registered our lock, we need to make sure that it gets
   cleaned up during the resulting file close().
Signed-off-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

f6297acf

13 Aug, 2004 3 commits

[PATCH] Fix NFS client screw-up in fcntl f_op removal · 031ded82
Jeff Garzik authored 20 years ago
```
Fix stupid thinkos in the fcntl f_op removal code.
```
031ded82

Fix stupid thinkos in the fcntl f_op removal code. · 99b75d35
Linus Torvalds authored 20 years ago
```
Tssk. 
```
99b75d35

[PATCH] Remove fcntl f_op · 401f0fbd

Matthew Wilcox authored 20 years ago

The newly introduced ->fcntl file_operation is badly thought out,
not to mention undocumented.  This patch replaces it with two better
defined operations -- check_flags and dir_notify.  Any other fcntl()s
that filesystems are interested in can have their own properly typed
f_op method when they need it.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

401f0fbd

14 Jul, 2004 1 commit

[PATCH] sparse: read_descriptor_t annotation · 790d43fd

Alexander Viro authored 20 years ago

We have a fun situation with read_descriptor_t - all its instances end
up passed to some actor; these actors use desc->buf as their private
data; there are 5 of them and they expect, respectively:

        struct lo_read_data *
        struct svc_rqst *
        struct file *
        struct rpc_xprt *
        char __user *

IOW, there is no type safety whatsoever; the field is essentially untyped,
we rely on the fact that actor is chosen by the same code that sets ->buf
and expect it to put something of the right type there.

Right now desc->buf is declared as char __user *.  Moreover, the last
argument of ->sendfile() (what should be stored in ->buf) is void __user *,
even though it's actually _never_ a userland pointer.

If nothing else, ->sendfile() should take void * instead; that alone removes
a bunch of bogus warnings.  I went further and replaced desc->buf with a
union of void * and char __user *.

790d43fd
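
The union described in the message above can be sketched in plain C. This is only an illustration under stated assumptions: the `toy_` names, the `arg`/`data` field names, and the actor below are hypothetical, and `__user` is defined away because it only means something to sparse.

```c
#include <stddef.h>

/* Sparse address-space annotation; expands to nothing for an ordinary compiler. */
#define __user

/* Toy stand-in for read_descriptor_t (field names are assumptions).
 * The formerly untyped buf becomes a union: kernel-side actors use
 * arg.data, and only the copy-to-userland actor uses arg.buf. */
typedef struct {
    size_t written;          /* bytes handed to the actor so far */
    size_t count;            /* bytes still wanted */
    union {
        char __user *buf;    /* genuine userland destination */
        void *data;          /* actor-private kernel pointer */
    } arg;
    int error;
} toy_read_descriptor_t;

/* Hypothetical actor-private state, standing in for e.g. struct lo_read_data. */
struct toy_actor_state {
    size_t bytes_seen;
};

/* Hypothetical actor: recovers its private state from arg.data, so no
 * cast from a __user pointer is needed. Returns the bytes consumed. */
size_t toy_actor(toy_read_descriptor_t *desc, size_t size)
{
    struct toy_actor_state *st = desc->arg.data;
    size_t n = size < desc->count ? size : desc->count;

    st->bytes_seen += n;
    desc->written += n;
    desc->count -= n;
    return n;
}
```

The code that picks the actor still has to store the matching pointer in `arg.data`, so the union documents the two uses rather than enforcing type safety.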

29 May, 2004 1 commit
- [PATCH] sparse: nfs __user annotation (client only, and not touching RPC) · d608b1b2
  Alexander Viro authored 20 years ago
  
  d608b1b2

20 May, 2004 1 commit

NFS O_DIRECT: Change the NFS O_DIRECT path so that it · 78f673ca

Trond Myklebust authored 20 years ago

    no longer calls the generic VFS read and write routines.
    This allows all application read requests to pass through
    to the server, instead of just the ones that appear to be
    inside the file. This eliminates the requirement to use a
    GETATTR operation before each read or write to determine
    where the EOF is. This is a significant performance and
    scalability win.

    It also removes all requirements for holding the inode
    semaphore during NFS direct reads and writes, as the read
    and write logic no longer needs atomic access to the size
    of the file. This also helps client CPU scalability by
    reducing the serialization of writes against a single file.
                                                                                
    Patch by Chuck Lever

78f673ca

12 Apr, 2004 1 commit

[PATCH] add file_operations.fcntl · cea39746

Andrew Morton authored 20 years ago

From: Chuck Lever <cel@citi.umich.edu>

O_DIRECT|O_APPEND cannot possibly work on NFS, so NFS needs some way of
preventing the user from setting this combination.  We felt that the best
way of implementing this restriction is to allow the filesystem to implement
its own fcntl() handler.

This patch does that, and provides the appropriate handler for NFS.

Additional details from Chuck:

Forgetting O_DIRECT for a moment, O_APPEND writes on NFS don't work in any
case when multiple clients are writing to a file, since an NFS client can
never guarantee it knows where the true end of file is 100% of the time.
It works as expected iff only one client writes to an O_APPEND file at a
time.

Multi-client O_APPEND writing doesn't seem to be a problem for any
application I'm aware of.  Since it can be made to behave in the
multi-client case with careful application logic or by using file locking,
I don't think we should disallow it.

I want to drop the inode semaphore when doing NFS direct I/O because it is
synchronous; holding the i_sem means we reduce direct I/O concurrency to
one I/O per file at a time.  The important thing sct was worried about was
the case where a single client is writing with O_APPEND and O_DIRECT, and
we don't hold the i_sem during the write.

We must at least hold the i_sem when determining where the end of file is
to do the O_APPEND write.  In 2.6, I believe that is handled correctly in
the VFS layer, so this is not an issue for 2.6, right?

cea39746
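
As a rough sketch of the restriction discussed above, a filesystem-specific flag check might reject the append-plus-direct combination as follows. The `TOY_` flag bits and the function name are hypothetical; real code would test `O_APPEND` and `O_DIRECT` from `<fcntl.h>`.

```c
#include <errno.h>

/* Hypothetical flag bits for illustration only; they are not the real
 * O_APPEND / O_DIRECT values. */
#define TOY_O_APPEND 0x1
#define TOY_O_DIRECT 0x2

/* Returns 0 if the flag combination is acceptable, or -EINVAL if append
 * and direct I/O are requested together. Either flag alone is allowed. */
int toy_check_flags(int flags)
{
    const int both = TOY_O_APPEND | TOY_O_DIRECT;

    if ((flags & both) == both)
        return -EINVAL;
    return 0;
}
```

Calling `toy_check_flags(TOY_O_APPEND | TOY_O_DIRECT)` yields `-EINVAL`, while either flag on its own yields 0.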

13 Mar, 2004 2 commits

NFSv2/v3/v4: Ensure that fsync() flushes all writebacks to disk rather than just the · 680c0ee6
Trond Myklebust authored 20 years ago
```
      ones labelled as belonging to our file. This fixes a bug in which msync(MS_SYNC)
      will fail to flush the pages to disk.
```
680c0ee6

NFSv2/v3/v4: New attribute revalidation code that no · ca9268fe

Trond Myklebust authored 20 years ago

     longer relies on ctime for correctness in avoiding
     update races.

VFS: allow filesystems to disable inode_update_time() on
     a per-inode basis.

ca9268fe

07 Feb, 2004 1 commit
- NFSv4: Add support for POSIX file locking. · 3f1990d3
  Trond Myklebust authored 20 years ago
  
  3f1990d3

19 Jan, 2004 1 commit

[PATCH] bdev: use correct mapping's i_sem · 54df7662

Andrew Morton authored 21 years ago

From: viro@parcelfarce.linux.theplanet.co.uk <viro@parcelfarce.linux.theplanet.co.uk>

In a bunch of places we used file->f_dentry->d_inode->i_sem to protect
fdatasync et al. Replaced with the correct file->f_mapping->host->i_sem - the
object we are protecting is address_space, so we want an exclusion that would
work for redirected ->i_mapping. For normal files (not coda, not bdev) it's
all the same, of course - there we have

file->f_mapping->host == file->f_dentry->d_inode

and the change above is an equivalent transformation.

54df7662
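
The difference between the two lock choices can be illustrated with toy structures. This is a simplified model, not the kernel's definitions: the `toy_` types are hypothetical, the dentry level is folded into a direct `d_inode` pointer, and `i_sem_id` is just a tag identifying which inode's semaphore would be taken.

```c
#include <stddef.h>

struct toy_inode;

struct toy_address_space {
    struct toy_inode *host;               /* inode that owns this mapping */
};

struct toy_inode {
    int i_sem_id;                         /* tag standing in for i_sem */
    struct toy_address_space i_data;      /* the inode's own mapping */
    struct toy_address_space *i_mapping;  /* may be redirected elsewhere */
};

struct toy_file {
    struct toy_inode *d_inode;            /* stands in for f_dentry->d_inode */
    struct toy_address_space *f_mapping;
};

/* Old choice: the semaphore of the inode the file was opened through. */
int toy_lock_id_old(const struct toy_file *f)
{
    return f->d_inode->i_sem_id;
}

/* New choice: the semaphore of the inode hosting the address_space. */
int toy_lock_id_new(const struct toy_file *f)
{
    return f->f_mapping->host->i_sem_id;
}
```

For an ordinary file both functions return the same tag. When the file's mapping is redirected to another inode's `i_data`, only `toy_lock_id_new()` names the inode that actually owns the pages.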

07 Oct, 2003 1 commit

NFSv4 state model update · 1a7bc914

Trond Myklebust authored 21 years ago

  - Hierarchy of state attached to nfs4_client in order to
    simplify state recovery.
    state_owners hang off nfs4_client, whereas state hangs
    off both open_owners and the nfs_inode.
    
  - Model tries to minimize the number of open_owners on
    the server by recycling unused open_owners into a pool.
    
  - NFSv4 state attached to file->private_data. Previously
    this was used by credentials (and still is for NFSv2/v3)
    Abstract out setup/release of struct file and nfs_page
    structure initialization in order to cope with these
    conflicting uses of private_data.

1a7bc914

04 Jul, 2003 1 commit

[PATCH] Use the intents in 'nameidata' to improve NFS close-to-open consistency · 52d1430d

Trond Myklebust authored 21 years ago

  - Make use of the open intents to improve close-to-open
    cache consistency. Only force data cache revalidation when
    we're doing an open().

  - Add true exclusive create to NFSv3.

  - Optimize away the redundant ->lookup() to check for an
    existing file when we know that we're doing NFSv3 exclusive
    create.

  - Optimize away all ->permission() checks other than those for
    path traversal, open(), and sys_access().

52d1430d

02 Jul, 2003 1 commit

[PATCH] remove lock_kernel() from file_ops.flush() · e90f7e03

Andrew Morton authored 21 years ago

Rework the file_ops.flush() API so that it is no longer called under
lock_kernel().  Push lock_kernel() down to all implementations except CIFS,
which doesn't want it.

e90f7e03

07 May, 2003 1 commit
- Fix typos in close-to-open cache consistency checking. · 1a961d01
  Trond Myklebust authored 21 years ago
  
  1a961d01

08 Apr, 2003 1 commit
- Prepare for the introduction of NFSv4 state code. · ec059472
  Trond Myklebust authored 21 years ago
```
Split out the open() method for regular files from that of
directories.
```
  ec059472

08 Mar, 2003 1 commit

[PATCH] Implement sendfile() for NFS · 731cf67c

Andrew Morton authored 21 years ago

Patch from Trond Myklebust <trond.myklebust@fys.uio.no>

Implement sendfile() for the NFS client.  This is required for loop-on-NFS
support.

731cf67c

20 Dec, 2002 1 commit

[PATCH] give NFS client a "set_page_dirty" address space op. · 756e3174

Chuck Lever authored 22 years ago

Description:
  The default set_page_dirty address space op is too heavyweight for NFS,
  which doesn't use buffers.

756e3174

08 Nov, 2002 1 commit

[PATCH] Add nfs_writepages & backing_dev... · df43f015

Trond Myklebust authored 22 years ago

The following patch adds a simple ->writepages method that interprets
the extra information passed down in Andrew's writeback_control
structure, and translates it into nfs-speak.

It also adds a backing_dev_info structure that scales the readahead in
terms of the rsize. Maximum readahead is still 128k if you use 32k
rsize, but it is scaled down to 4k if you use 1k rsize.

df43f015
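
The two readahead figures quoted above (128k at 32k rsize, 4k at 1k rsize) are both consistent with a fixed multiple of four times rsize. The sketch below assumes that multiplier; it is inferred from those two data points, not taken from the patch, and the names are hypothetical.

```c
#include <stddef.h>

/* Assumed multiplier, inferred from the figures in the message
 * (128k / 32k == 4 and 4k / 1k == 4). */
#define TOY_READAHEAD_FACTOR 4

/* Maximum readahead in bytes for a given rsize in bytes. */
size_t toy_readahead_bytes(size_t rsize)
{
    return rsize * TOY_READAHEAD_FACTOR;
}
```

With this assumption, `toy_readahead_bytes(32 * 1024)` gives 131072 bytes (128k) and `toy_readahead_bytes(1024)` gives 4096 bytes (4k).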

05 Nov, 2002 1 commit

[PATCH] Convert NFS client to use ->readpages() · b9a2dd76

Trond Myklebust authored 22 years ago

  - Add the library function read_cache_pages(), which is used in a
    similar fashion to the single page 'read_cache_page()'. It hides
    the details of the LRU cache etc. from a filesystem that wants to
    populate an address space with a list of pages.

  - Fix NFS so that readahead uses the ->readpages() interface. Means
    that we can immediately schedule an RPC call in order to complete
    the I/O, rather than relying on somebody later triggering it by
    calling lock_page() (and hence sync_page()). The sync_page()
    method is race-prone, since the waiting page may try to call it
    before we've finished initializing the 'struct nfs_page'.

  - Clear out nfs_sync_page(), the nfs_inode->read list, and
    friends. When the I/O completion gets scheduled in ->readpage() and
    ->readpages(), they have no reason to exist.

b9a2dd76

15 Oct, 2002 1 commit

[PATCH] A basic NFSv4 client for 2.5.x · bf5344dc

Trond Myklebust authored 22 years ago

Now that all the hooks are in place, this large patch imports all
of the new code for the NFSv4 client.
  nfs4proc.c   - procedure vectors
  nfs4xdr.c    - XDR
  nfs4state.c  - state bookkeeping (very minimal for now)
  nfs4renewd.c - a daemon (implemented as an rpc_task) to keep
                 state from expiring on the server

Note: The RPCSEC_GSS authentication code is not yet included here.
  For the moment we make do with AUTH_UNIX, aka AUTH_SYS.

  Neither is the code to do upcalls to userland in order to do
  uid/gid <-> name mappings. Instead, stubs have been added to
  translate everything to 'nobody:nobody' == '-2:-2'

bf5344dc

07 Oct, 2002 2 commits

[PATCH] initial support for NFS direct I/O for 2.5 · c03e7607

Chuck Lever authored 22 years ago

This adds initial support for NFS direct I/O in the 2.5 kernel.  Many
have asked for this support to be included in 2.5.  This patch does not
provide working NFS direct I/O, but I'm sending what I have now so that
it can be included before October 20.

NFS direct I/O is enabled by its very own kernel config option.  When
enabled, the NFS client won't build, to prevent people from using this and
possibly corrupting their NFS files.  Later I will send a patch that
finishes the implementation.

[ Config option currently disabled ]

c03e7607

[PATCH] remove NFS client internal dependence on page->index · eb582eba

Chuck Lever authored 22 years ago

This makes the NFS client copy the page->index field into its read and
write request structures (struct nfs_page) when setting up I/O on a
page.  This makes it possible for NFS direct I/O support to reuse
existing NFS client subroutines, and helps eventually allow NFS I/O to
and from anonymous pages.  It is a prerequisite for NFS direct I/O
support.

eb582eba

24 Sep, 2002 1 commit
- several updates for testing aio_{read,write} · e828d709
  Benjamin LaHaise authored 22 years ago
```
support for file descriptors with only async ops in vfs_{read,write}
```
  e828d709

10 Sep, 2002 1 commit
- [PATCH] designated initializer patches for fs_nfs · af692bd6
  Art Haas authored 22 years ago
```
  Here are some patches for C99 initializers in fs/nfs. Patches
  are against 2.5.32.
```
  af692bd6

30 Aug, 2002 1 commit

[PATCH] writeback correctness and efficiency changes · ec12ac49

Andrew Morton authored 22 years ago

This is a performance and correctness fix against the writeback paths.

The writeback code has competing requirements.  Sometimes it is used
for "memory cleansing": kupdate, bdflush, writer throttling, page
allocator writeback, etc.  And sometimes this same code is used for
data integrity purposes: fsync, msync, fdatasync, sync, umount, various
other kernel-internal uses.

The problem is: how to handle a dirty buffer or page which is currently
under writeback.

For memory cleansing, we just want to skip that buffer/page and go onto
the next one.  But for sync, we must wait on the old writeback and then
start new writeback.

mpage_writepages() is currently correct for cleansing, but incorrect for
sync.  block_write_full_page() is currently correct for sync, but
inefficient for cleansing.

The fix is fairly simple.

- In mpage_writepages(), don't skip the page if it's a sync
operation.

- In block_write_full_page(), skip the buffer if it is not a sync
operation.  And return -EAGAIN to tell the caller that the writeout
didn't work out.  The caller must then set the page dirty again and
move it onto mapping->dirty_pages.

This is an extension of the writepage API: writepage can now return
EAGAIN.  There are only three callers, and they have been updated.

fail_writepage() and ext3_writepage() were actually doing this by
hand.  They have been changed to return -EAGAIN.  NTFS will want to
be able to return -EAGAIN from its writepage as well.

- A sticky question is: how to tell the writeout code which mode it
is operating in?  Cleansing or sync?

It's such a tiny code change that I didn't have the heart to go and
propagate a `mode' argument down every instance of writepages() and
writepage() in the kernel.  So I passed it in via current->flags.

Incidentally, the occurrence of a locked-and-dirty buffer in
block_write_full_page() is fairly rare: normally the collision avoidance
happens at the address_space level, via PageWriteback.  But some
mappings (blockdevs, ext3 files, etc) have their dirty buffers written
out via submit_bh().  It is these buffers which can stall
block_write_full_page().

This wart will be pretty intrusive to fix.  ext3 needs to become fully
page-based (ugh.  It's a block-based journalling filesystem, and pages
are unnatural).  blockdev mappings are still written out by buffers
because that's how filesystems use them.  Putting _all_ metadata
(indirects, inodes, superblocks, etc) into standalone address_spaces
would fix that up.

- filemap_fdatawrite() sets PF_SYNC.  So filemap_fdatawrite() is the
kernel function which will start writeback against a mapping for
"data integrity" purposes, whereas the unexported, internal-only
do_writepages() is the writeback function which is used for memory
cleansing.  This difference is the reason why I didn't consolidate
those functions ages ago...

- Lots of code paths had a bogus extra call to filemap_fdatawait(),
which I previously added in a moment of weak-headedness.  They have
all been removed.

ec12ac49
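
The -EAGAIN convention described above can be modelled in a few lines. This sketch reads the rule as "cleansing skips a busy buffer, sync waits for it", following the earlier paragraphs of the message. All `toy_` names are hypothetical, the sync mode is passed as a plain argument rather than through current->flags, and waiting is modelled as simply clearing the busy flag.

```c
#include <errno.h>
#include <stdbool.h>

struct toy_page {
    bool dirty;        /* page still needs writing */
    bool buffer_busy;  /* a buffer is locked by earlier I/O */
};

/* Toy writepage. In sync mode a busy buffer is waited for (modelled as
 * clearing the flag). In cleansing mode it is skipped and -EAGAIN tells
 * the caller that the writeout did not happen. */
int toy_writepage(struct toy_page *page, bool sync)
{
    if (page->buffer_busy) {
        if (!sync)
            return -EAGAIN;
        page->buffer_busy = false;  /* models waiting for the old I/O */
    }
    return 0;
}

/* Toy caller: clears the dirty state before writeout and restores it
 * when writepage reports -EAGAIN, so a later pass retries the page. */
int toy_write_one(struct toy_page *page, bool sync)
{
    int ret;

    page->dirty = false;
    ret = toy_writepage(page, sync);
    if (ret == -EAGAIN)
        page->dirty = true;
    return ret;
}
```

A cleansing pass over a page with a busy buffer returns `-EAGAIN` and leaves the page dirty; a sync pass over the same page returns 0 and leaves it clean.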

12 Jun, 2002 1 commit

[PATCH] fs/locks.c cleanup · 62737480

Matthew Wilcox authored 22 years ago

 - Inline locks_notify_blocked.
 - Remove a couple of now-bogus comments.
 - Remove the obsolete F_SHLCK and F_EXLCK cases.
 - Remove the last remaining reference to FL_BROKEN.

62737480

22 May, 2002 1 commit
- [PATCH] kill ->i_op->revalidate() · cc41b90f
  Alexander Viro authored 22 years ago
```
kill ->i_op->revalidate()
```
  cc41b90f

30 Apr, 2002 1 commit

[PATCH] page writeback locking update · a2bcb3a0

Andrew Morton authored 22 years ago

- Fixes a performance problem - callers of
  prepare_write/commit_write, etc are locking pages, which synchronises
  them behind writeback, which also locks these pages.  Significant
  slowdowns for some workloads.

- So pages are no longer locked while under writeout.  Introduce a
  new PG_writeback and associated infrastructure to support this design
  change.

- Pages which are under read I/O still use PageLocked.  Pages which
  are under write I/O have PageWriteback() true.

  I considered creating Page_IO instead of PageWriteback, and marking
  both readin and writeout pages as PageIO().  So pages are unlocked
  during both read and write.  There just doesn't seem a need to do
  this - nobody ever needs unblocking access to a page which is under
  read I/O.

- Pages under swapout (brw_page) are PageLocked, not PageWriteback.
  So their treatment is unchanged.

  It's not obvious that pages which are under swapout actually need
  the more asynchronous behaviour of PageWriteback.

  I was setting the swapout pages PageWriteback and unlocking them
  prior to submitting the buffers in brw_page().  This led to deadlocks
  on the exit_mmap->zap_page_range->free_swap_and_cache path.  These
  functions call block_flushpage under spinlock.  If the page is
  unlocked but has locked buffers, block_flushpage->discard_buffer()
  sleeps.  Under spinlock.  So that will need fixing if for some reason
  we want swapout to use PageWriteback.

  Kernel has called block_flushpage() under spinlock for a long time.
   It is assuming that a locked page will never have locked buffers.
  This appears to be true, but it's ugly.

- Adds new function wait_on_page_writeback().  Renames wait_on_page()
  to wait_on_page_locked() to remind people that they need to call the
  appropriate one.

- Renames filemap_fdatasync() to filemap_fdatawrite().  It's more
  accurate - "sync" implies, if anything, writeout and wait.  (fsync,
  msync) Or writeout.  It's not clear.

- Subtly changes the filemap_fdatawrite() internals - this function
  used to do a lock_page() - it waited for any other user of the page
  to let go before submitting new I/O against a page.  It has been
  changed to simply skip over any pages which are currently under
  writeback.

  This is the right thing to do for memory-cleansing reasons.

  But it's the wrong thing to do for data consistency operations (eg,
  fsync()).  For those operations we must ensure that all data which
  was dirty *at the time of the system call* are tight on disk before
  the call returns.

  So all places which care about this have been converted to do:

	filemap_fdatawait(mapping);	/* Wait for current writeback */
	filemap_fdatawrite(mapping);	/* Write all dirty pages */
	filemap_fdatawait(mapping);	/* Wait for I/O to complete */

- Fixes a truncate_inode_pages problem - truncate currently will
  block when it hits a locked page, so it ends up getting into lockstep
  behind writeback and all of the file is pointlessly written back.

  One fix for this is for truncate to simply walk the page list in the
  opposite direction from writeback.

  I chose to use a separate cleansing pass.  It is more
  CPU-intensive, but it is surer and clearer.  This is because there is
  no reason why the per-address_space ->vm_writeback and
  ->writeback_mapping functions *have* to perform writeout in
  ->dirty_pages order.  They may choose to do something totally
  different.

  (set_page_dirty() is an a_op now, so address_spaces could almost
  privatise the whole dirty-page handling thing.  Except
  truncate_inode_pages and invalidate_inode_pages assume that the pages
  are on the address_space lists.  hmm.  So making truncate_inode_pages
  and invalidate_inode_pages a_ops would make some sense).

a2bcb3a0
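
The reason for the wait/write/wait sequence in the message above can be seen in a toy model. Everything here is a simplification with hypothetical `toy_` names: pages carry only two flags, and "waiting" completes the I/O instantly.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_page {
    bool dirty;      /* modified since the last writeout started */
    bool writeback;  /* write I/O currently in flight */
};

/* Start writeback on dirty pages, skipping any page already under
 * writeback (the new behaviour the message describes). */
void toy_fdatawrite(struct toy_page *pages, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (pages[i].dirty && !pages[i].writeback) {
            pages[i].dirty = false;
            pages[i].writeback = true;
        }
    }
}

/* Wait for in-flight writeback; modelled as completing immediately. */
void toy_fdatawait(struct toy_page *pages, size_t n)
{
    for (size_t i = 0; i < n; i++)
        pages[i].writeback = false;
}

/* Data-integrity sequence: wait, write, wait. */
void toy_sync(struct toy_page *pages, size_t n)
{
    toy_fdatawait(pages, n);   /* wait for current writeback */
    toy_fdatawrite(pages, n);  /* write all dirty pages */
    toy_fdatawait(pages, n);   /* wait for I/O to complete */
}

size_t toy_count_dirty(const struct toy_page *pages, size_t n)
{
    size_t dirty = 0;

    for (size_t i = 0; i < n; i++)
        if (pages[i].dirty)
            dirty++;
    return dirty;
}
```

A page that was redirtied while already under writeback is skipped by a lone write-then-wait and stays dirty. The leading wait in `toy_sync()` lets the write pass pick it up, so no page is left dirty.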

18 Feb, 2002 1 commit

[PATCH] msync correctness · 1c000719

Andrew Morton authored 22 years ago

A forward port.  At present, msync() does not report errors
from EIO or ENOSPC.  fsync() has the same bug for mapped pages
against the affected fd.

The patch correctly propagates these errors back up from
writepage so that fsync and msync correctly report errors.

It's fairly important - msync is the only way we have
of reporting ENOSPC against sparse mappings.

Of course, you can still silently lose your data if it's kswapd who
gets ENOSPC during writepage.  I have 3/4 of a patch for that.  It
records the data loss so that a later msync() will report the bad
news.

This patch also adds an implementation of msync(MS_ASYNC), because
it was easy.

1c000719

09 Feb, 2002 1 commit

[PATCH] includes cleanup, 2nd try. · 7021dc36

Dave Jones authored 22 years ago

Big bits first, I'll redo the smaller bits tomorrow after some sleep.
Same as last time, rediffed against pre5

7021dc36

06 Feb, 2002 1 commit

[PATCH] 2.5.4-pre1: further llseek cleanup (1/3) · 5284a260

Robert Love authored 22 years ago

This is the first of three patches implementing further llseek cleanup,
against 2.5.4-pre1.

The 'push locking into llseek methods' patch was integrated into 2.5.3.
The networking filesystems, however, do not protect i_size and can not
rely on the inode semaphore used in generic_file_llseek.

This patch implements a remote_llseek method, which is basically the
pre-2.5.3 version of generic_file_llseek.  Locking is done via the BKL.
When we have a saner locking system in place, we can push it into this
function in lieu.

Ncpfs, nfs, and smbfs have been converted to use this new llseek.

Note this is updated over the previous posted patch.

	Robert Love

5284a260

05 Feb, 2002 5 commits

v2.5.1.7 -> v2.5.1.8 · 2161cc3b

Linus Torvalds authored 22 years ago

- Greg KH: USB updates
- various: kdev_t updates
- Al Viro: more bread()/filesystem cleanups

2161cc3b

v2.5.1.5 -> v2.5.1.6 · a914dd8b

Linus Torvalds authored 22 years ago

- Davide Libenzi: nicer timeslices for scheduler
- Arnaldo: wd7000 scsi driver cleanups and bio update
- Greg KH: USB update (including initial 2.0 support)
- me: strict typechecking on "kdev_t"

a914dd8b

v2.5.1 -> v2.5.1.1 · 0925bad3

Linus Torvalds authored 22 years ago

- me: revert the "kill(-1..)" change.  POSIX isn't that clear on the
issue anyway, and the new behaviour breaks things.
- Jens Axboe: more bio updates
- Al Viro: rd_load cleanups. hpfs mount fix, mount cleanups
- Ingo Molnar: more raid updates
- Jakub Jelinek: fix Linux/x86 confusion about arg passing of "save_v86_state" and "do_signal"
- Trond Myklebust: fix NFS client race conditions

0925bad3

v2.4.9.15 -> v2.4.10 · 8c7cba55

Linus Torvalds authored 22 years ago

  - Andrew Grover: ACPI update
  - Al Viro: block devices..
  - Andrea Arcangeli: fix list manipulation bogosity
  - Trond Myklebust: 64-bit file locking fixes
  - Brad Hards: USB CDC ethernet
  - Chris Mason: reiserfs speedup
  - Robert Love: re-merge AMD 761 GART support that was lost in -ac merge
  - Adam Richter: check pci_module_init() return value

8c7cba55

v2.4.9.4 -> v2.4.9.5 · 1c3cefa5

Linus Torvalds authored 22 years ago

  - Merge with Alan
  - Trond Myklebust: NFS fixes - kmap and root inode special case
  - Al Viro: more superblock cleanups, inode leak in rd.c, minix
  directories in page cache
  - Paul Mackerras: clean up rubbish from sl82c105.c
  - Neil Brown: md/raid cleanups, NFS filehandles
  - Johannes Erdfelt: USB update (usb-2.0 support, visor fix, Clie fix,
  pl2303 driver update)
  - David Miller: sparc and net update
  - Eric Biederman: simplify and correct bootdata allocation - don't
  overwrite ramdisks
  - Tim Waugh: support multiple SuperIO devices, parport doc updates

1c3cefa5