Commit 541010e4 authored by Linus Torvalds

Merge branch 'locks' of git://linux-nfs.org/~bfields/linux

* 'locks' of git://linux-nfs.org/~bfields/linux:
  nfsd: remove IS_ISMNDLCK macro
  Rework /proc/locks via seq_files and seq_list helpers
  fs/locks.c: use list_for_each_entry() instead of list_for_each()
  NFS: clean up explicit check for mandatory locks
  AFS: clean up explicit check for mandatory locks
  9PFS: clean up explicit check for mandatory locks
  GFS2: clean up explicit check for mandatory locks
  Cleanup macros for distinguishing mandatory locks
  Documentation: move locks.txt in filesystems/
  locks: add warning about mandatory locking races
  Documentation: move mandatory locking documentation to filesystems/
  locks: Fix potential OOPS in generic_setlease()
  Use list_first_entry in locks_wake_up_blocks
  locks: fix flock_lock_file() comment
  Memory shortage can result in inconsistent flocks state
  locks: kill redundant local variable
  locks: reverse order of posix_locks_conflict() arguments
parents e457f790 5e7fc436
@@ -145,7 +145,7 @@ fb/
 feature-removal-schedule.txt
 	- list of files and features that are going to be removed.
 filesystems/
-	- directory with info on the various filesystems that Linux supports.
+	- info on the vfs and the various filesystems that Linux supports.
 firmware_class/
 	- request_firmware() hotplug interface info.
 floppy.txt
@@ -230,8 +230,6 @@ local_ops.txt
 	- semantics and behavior of local atomic operations.
 lockdep-design.txt
 	- documentation on the runtime locking correctness validator.
-locks.txt
-	- info on file locking implementations, flock() vs. fcntl(), etc.
 logo.gif
 	- full colour GIF image of Linux logo (penguin - Tux).
 logo.txt
@@ -240,8 +238,6 @@ m68k/
 	- directory with info about Linux on Motorola 68k architecture.
 magic-number.txt
 	- list of magic numbers used to mark/protect kernel data structures.
-mandatory.txt
-	- info on the Linux implementation of Sys V mandatory file locking.
 mca.txt
 	- info on supporting Micro Channel Architecture (e.g. PS/2) systems.
 md.txt
......
@@ -52,6 +52,10 @@ isofs.txt
 	- info and mount options for the ISO 9660 (CDROM) filesystem.
 jfs.txt
 	- info and mount options for the JFS filesystem.
+locks.txt
+	- info on file locking implementations, flock() vs. fcntl(), etc.
+mandatory-locking.txt
+	- info on the Linux implementation of Sys V mandatory file locking.
 ncpfs.txt
 	- info on Novell Netware(tm) filesystem using NCP protocol.
 ntfs.txt
......
@@ -53,11 +53,11 @@ fcntl(), with all the problems that implies.
 1.3 Mandatory Locking As A Mount Option
 ---------------------------------------
-Mandatory locking, as described in 'Documentation/mandatory.txt' was prior
-to this release a general configuration option that was valid for all
-mounted filesystems. This had a number of inherent dangers, not the least
-of which was the ability to freeze an NFS server by asking it to read a
-file for which a mandatory lock existed.
+Mandatory locking, as described in 'Documentation/filesystems/mandatory.txt'
+was prior to this release a general configuration option that was valid for
+all mounted filesystems. This had a number of inherent dangers, not the
+least of which was the ability to freeze an NFS server by asking it to read
+a file for which a mandatory lock existed.
 From this release of the kernel, mandatory locking can be turned on and off
 on a per-filesystem basis, using the mount options 'mand' and 'nomand'.
......
@@ -3,7 +3,26 @@
 Andy Walker <andy@lysaker.kvaerner.no>
 15 April 1996
+(Updated September 2007)
+0. Why you should avoid mandatory locking
+-----------------------------------------
+The Linux implementation is prey to a number of difficult-to-fix race
+conditions which in practice make it not dependable:
+	- The write system call checks for a mandatory lock only once
+	  at its start. It is therefore possible for a lock request to
+	  be granted after this check but before the data is modified.
+	  A process may then see file data change even while a mandatory
+	  lock was held.
+	- Similarly, an exclusive lock may be granted on a file after
+	  the kernel has decided to proceed with a read, but before the
+	  read has actually completed, and the reading process may see
+	  the file data in a state which should not have been visible
+	  to it.
+	- Similar races make the claimed mutual exclusion between lock
+	  and mmap similarly unreliable.
 1. What is mandatory locking?
 ------------------------------
......
@@ -105,7 +105,7 @@ static int v9fs_file_lock(struct file *filp, int cmd, struct file_lock *fl)
 	P9_DPRINTK(P9_DEBUG_VFS, "filp: %p lock: %p\n", filp, fl);
 	/* No mandatory locks */
-	if ((inode->i_mode & (S_ISGID | S_IXGRP)) == S_ISGID)
+	if (__mandatory_lock(inode))
 		return -ENOLCK;
 	if ((IS_SETLK(cmd) || IS_SETLKW(cmd)) && fl->fl_type != F_UNLCK) {
......
@@ -524,8 +524,7 @@ int afs_lock(struct file *file, int cmd, struct file_lock *fl)
 	       (long long) fl->fl_start, (long long) fl->fl_end);
 	/* AFS doesn't support mandatory locks */
-	if ((vnode->vfs_inode.i_mode & (S_ISGID | S_IXGRP)) == S_ISGID &&
-	    fl->fl_type != F_UNLCK)
+	if (__mandatory_lock(&vnode->vfs_inode) && fl->fl_type != F_UNLCK)
 		return -ENOLCK;
 	if (IS_GETLK(cmd))
......
@@ -535,7 +535,7 @@ static int gfs2_lock(struct file *file, int cmd, struct file_lock *fl)
 	if (!(fl->fl_flags & FL_POSIX))
 		return -ENOLCK;
-	if ((ip->i_inode.i_mode & (S_ISGID | S_IXGRP)) == S_ISGID)
+	if (__mandatory_lock(&ip->i_inode))
 		return -ENOLCK;
 	if (sdp->sd_args.ar_localflocks) {
@@ -636,7 +636,7 @@ static int gfs2_flock(struct file *file, int cmd, struct file_lock *fl)
 	if (!(fl->fl_flags & FL_FLOCK))
 		return -ENOLCK;
-	if ((ip->i_inode.i_mode & (S_ISGID | S_IXGRP)) == S_ISGID)
+	if (__mandatory_lock(&ip->i_inode))
 		return -ENOLCK;
 	if (sdp->sd_args.ar_localflocks)
......
@@ -577,8 +577,7 @@ static int nfs_lock(struct file *filp, int cmd, struct file_lock *fl)
 	nfs_inc_stats(inode, NFSIOS_VFSLOCK);
 	/* No mandatory locks over NFS */
-	if ((inode->i_mode & (S_ISGID | S_IXGRP)) == S_ISGID &&
-	    fl->fl_type != F_UNLCK)
+	if (__mandatory_lock(inode) && fl->fl_type != F_UNLCK)
 		return -ENOLCK;
 	if (IS_GETLK(cmd))
......
@@ -2035,7 +2035,7 @@ static inline int
 io_during_grace_disallowed(struct inode *inode, int flags)
 {
 	return nfs4_in_grace() && (flags & (RD_STATE | WR_STATE))
-		&& MANDATORY_LOCK(inode);
+		&& mandatory_lock(inode);
 }
 /*
......
@@ -61,12 +61,6 @@
 #define NFSDDBG_FACILITY NFSDDBG_FILEOP
-/* We must ignore files (but only files) which might have mandatory
- * locks on them because there is no way to know if the accesser has
- * the lock.
- */
-#define IS_ISMNDLK(i) (S_ISREG((i)->i_mode) && MANDATORY_LOCK(i))
 /*
  * This is a cache of readahead params that help us choose the proper
  * readahead strategy. Initially, we set all readahead parameters to 0
@@ -689,7 +683,12 @@ nfsd_open(struct svc_rqst *rqstp, struct svc_fh *fhp, int type,
 	err = nfserr_perm;
 	if (IS_APPEND(inode) && (access & MAY_WRITE))
 		goto out;
-	if (IS_ISMNDLK(inode))
+	/*
+	 * We must ignore files (but only files) which might have mandatory
+	 * locks on them because there is no way to know if the accesser has
+	 * the lock.
+	 */
+	if (S_ISREG((inode)->i_mode) && mandatory_lock(inode))
 		goto out;
 	if (!inode->i_fop)
......
@@ -66,7 +66,6 @@ extern int get_stram_list(char *);
 extern int get_filesystem_list(char *);
 extern int get_exec_domain_list(char *);
 extern int get_dma_list(char *);
-extern int get_locks_status (char *, char **, off_t, int);
 static int proc_calc_metrics(char *page, char **start, off_t off,
 				 int count, int *eof, int len)
@@ -624,16 +623,18 @@ static int cmdline_read_proc(char *page, char **start, off_t off,
 	return proc_calc_metrics(page, start, off, count, eof, len);
 }
-static int locks_read_proc(char *page, char **start, off_t off,
-				 int count, int *eof, void *data)
+static int locks_open(struct inode *inode, struct file *filp)
 {
-	int len = get_locks_status(page, start, off, count);
-	if (len < count)
-		*eof = 1;
-	return len;
+	return seq_open(filp, &locks_seq_operations);
 }
+static const struct file_operations proc_locks_operations = {
+	.open = locks_open,
+	.read = seq_read,
+	.llseek = seq_lseek,
+	.release = seq_release,
+};
 static int execdomains_read_proc(char *page, char **start, off_t off,
 				 int count, int *eof, void *data)
 {
@@ -691,7 +692,6 @@ void __init proc_misc_init(void)
 #endif
 		{"filesystems", filesystems_read_proc},
 		{"cmdline", cmdline_read_proc},
-		{"locks", locks_read_proc},
 		{"execdomains", execdomains_read_proc},
 		{NULL,}
 	};
@@ -709,6 +709,7 @@ void __init proc_misc_init(void)
 		entry->proc_fops = &proc_kmsg_operations;
 	}
 #endif
+	create_seq_entry("locks", 0, &proc_locks_operations);
 	create_seq_entry("devices", 0, &proc_devinfo_operations);
 	create_seq_entry("cpuinfo", 0, &proc_cpuinfo_operations);
 #ifdef CONFIG_BLOCK
......
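The hunks above replace the page-buffer reader for /proc/locks with a seq_file open routine that hands `locks_seq_operations` to `seq_open()`. The iterator itself lives in the part of this commit that is not shown here, so the following is only a userspace sketch of the general start/next/stop/show protocol that a `struct seq_operations` follows. Every `demo_` name, the record layout, and the output format are assumptions made for illustration and are not taken from the kernel sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace mirror of the four seq_file callbacks. */
struct demo_seq_ops {
    void *(*start)(void *priv, long *pos);
    void *(*next)(void *priv, void *v, long *pos);
    void  (*stop)(void *priv, void *v);
    int   (*show)(FILE *out, void *v);
};

/* Illustrative record; real /proc/locks lines carry more fields. */
struct demo_lock {
    int id;
    const char *kind;
};

struct demo_table {
    struct demo_lock *locks;
    long count;
};

/* Return the record at *pos, or NULL once the position runs off the end. */
static void *demo_start(void *priv, long *pos)
{
    struct demo_table *t = priv;
    return (*pos >= 0 && *pos < t->count) ? &t->locks[*pos] : NULL;
}

/* Advance the position and return the next record, or NULL at the end. */
static void *demo_next(void *priv, void *v, long *pos)
{
    (void)v;
    ++*pos;
    return demo_start(priv, pos);
}

/* Nothing to release in this sketch; a real iterator would typically
 * drop whatever protects the list it walked. */
static void demo_stop(void *priv, void *v)
{
    (void)priv;
    (void)v;
}

/* Format one record; return 0 on success, -1 on a write error. */
static int demo_show(FILE *out, void *v)
{
    const struct demo_lock *l = v;
    return fprintf(out, "%d: %s\n", l->id, l->kind) < 0 ? -1 : 0;
}

const struct demo_seq_ops demo_locks_ops = {
    .start = demo_start,
    .next  = demo_next,
    .stop  = demo_stop,
    .show  = demo_show,
};

/* Drive the callbacks the way a single full read would: start, then
 * show/next until NULL or an error, then stop. Returns records shown. */
int demo_seq_walk(const struct demo_seq_ops *ops, void *priv, FILE *out)
{
    long pos = 0;
    int shown = 0;
    void *v;

    assert(ops != NULL && out != NULL);
    v = ops->start(priv, &pos);
    while (v != NULL && ops->show(out, v) == 0) {
        shown++;
        v = ops->next(priv, v, &pos);
    }
    ops->stop(priv, v);
    return shown;
}
```

Calling `demo_seq_walk(&demo_locks_ops, &table, stdout)` on a table of two records prints one line per record. The point of the split is that the reader no longer sizes or fills a page buffer by hand, as the removed `locks_read_proc()` did.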
@@ -205,7 +205,7 @@ int rw_verify_area(int read_write, struct file *file, loff_t *ppos, size_t count
 	if (unlikely((pos < 0) || (loff_t) (pos + count) < 0))
 		goto Einval;
-	if (unlikely(inode->i_flock && MANDATORY_LOCK(inode))) {
+	if (unlikely(inode->i_flock && mandatory_lock(inode))) {
 		int retval = locks_mandatory_area(
 			read_write == READ ? FLOCK_VERIFY_READ : FLOCK_VERIFY_WRITE,
 			inode, file, pos, count);
......
@@ -883,6 +883,7 @@ extern int vfs_setlease(struct file *, long, struct file_lock **);
 extern int lease_modify(struct file_lock **, int);
 extern int lock_may_read(struct inode *, loff_t start, unsigned long count);
 extern int lock_may_write(struct inode *, loff_t start, unsigned long count);
+extern struct seq_operations locks_seq_operations;
 struct fasync_struct {
 	int magic;
@@ -1375,12 +1376,25 @@ extern int locks_mandatory_area(int, struct inode *, struct file *, loff_t, size
  * Candidates for mandatory locking have the setgid bit set
  * but no group execute bit - an otherwise meaningless combination.
  */
-#define MANDATORY_LOCK(inode) \
-	(IS_MANDLOCK(inode) && ((inode)->i_mode & (S_ISGID | S_IXGRP)) == S_ISGID)
+static inline int __mandatory_lock(struct inode *ino)
+{
+	return (ino->i_mode & (S_ISGID | S_IXGRP)) == S_ISGID;
+}
+/*
+ * ... and these candidates should be on MS_MANDLOCK mounted fs,
+ * otherwise these will be advisory locks
+ */
+static inline int mandatory_lock(struct inode *ino)
+{
+	return IS_MANDLOCK(ino) && __mandatory_lock(ino);
+}
 static inline int locks_verify_locked(struct inode *inode)
 {
-	if (MANDATORY_LOCK(inode))
+	if (mandatory_lock(inode))
 		return locks_mandatory_locked(inode);
 	return 0;
 }
@@ -1391,7 +1405,7 @@ static inline int locks_verify_truncate(struct inode *inode,
 					 struct file *filp,
 					 loff_t size)
 {
-	if (inode->i_flock && MANDATORY_LOCK(inode))
+	if (inode->i_flock && mandatory_lock(inode))
 		return locks_mandatory_area(
 			FLOCK_VERIFY_WRITE, inode, filp,
 			size < inode->i_size ? size : inode->i_size,
......