- 25 Apr, 2017 1 commit
-
-
Amir Goldstein authored
When delivering an event to userspace for a file on an NFS share, if the file is deleted on server side before user reads the event, user will not get the event. If the event queue contained several events, the stale event is quietly dropped and read() returns to user with events read so far in the buffer. If the event queue contains a single stale event or if the stale event is a permission event, read() returns to user with the kernel internal error code 518 (EOPENSTALE), which is not a POSIX error code. Check the internal return value -EOPENSTALE in fanotify_read(), just the same as it is checked in path_openat() and drop the event in the cases that it is not already dropped. This is a reproducer from Marko Rauhamaa: Just take the example program listed under "man fanotify" ("fantest") and follow these steps: ============================================================== NFS Server NFS Client(1) NFS Client(2) ============================================================== # echo foo >/nfsshare/bar.txt # cat /nfsshare/bar.txt foo # ./fantest /nfsshare Press enter key to terminate. Listening for events. # rm -f /nfsshare/bar.txt # cat /nfsshare/bar.txt read: Unknown error 518 cat: /nfsshare/bar.txt: Operation not permitted ============================================================== where NFS Client (1) and (2) are two terminal sessions on a single NFS Client machine. Reported-by:
Marko Rauhamaa <marko.rauhamaa@f-secure.com> Tested-by:
Marko Rauhamaa <marko.rauhamaa@f-secure.com> Cc: <linux-api@vger.kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by:
Amir Goldstein <amir73il@gmail.com> Signed-off-by:
Jan Kara <jack@suse.cz>
-
- 24 Apr, 2017 1 commit
-
-
Dan Carpenter authored
We recently shifted this code around, so we're no longer holding the lock on this path. Fixes: 755b5bc6 ("fsnotify: Remove indirection from mark list addition") Signed-off-by:
Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by:
Jan Kara <jack@suse.cz>
-
- 10 Apr, 2017 31 commits
-
-
Jan Kara authored
Pointer to ->free_mark callback unnecessarily occupies one long in each fsnotify_mark although they are the same for all marks from one notification group. Move the callback pointer to fsnotify_ops. Reviewed-by:
Miklos Szeredi <mszeredi@redhat.com> Reviewed-by:
Amir Goldstein <amir73il@gmail.com> Signed-off-by:
Jan Kara <jack@suse.cz>
-
Jan Kara authored
Currently we initialize mark->group only in fsnotify_add_mark_lock(). However we will need to access fsnotify_ops of corresponding group from fsnotify_put_mark() so we need mark->group initialized earlier. Do that in fsnotify_init_mark() which has a consequence that once fsnotify_init_mark() is called on a mark, the mark has to be destroyed by fsnotify_put_mark(). Reviewed-by:
Miklos Szeredi <mszeredi@redhat.com> Reviewed-by:
Amir Goldstein <amir73il@gmail.com> Signed-off-by:
Jan Kara <jack@suse.cz>
-
Jan Kara authored
inode_mark.c now contains only a single function. Move it to fs/notify/fsnotify.c and remove inode_mark.c. Reviewed-by:
Miklos Szeredi <mszeredi@redhat.com> Reviewed-by:
Amir Goldstein <amir73il@gmail.com> Signed-off-by:
Jan Kara <jack@suse.cz>
-
Jan Kara authored
These are very thin wrappers, just remove them. Drop fs/notify/vfsmount_mark.c as it is empty now. Reviewed-by:
Miklos Szeredi <mszeredi@redhat.com> Reviewed-by:
Amir Goldstein <amir73il@gmail.com> Signed-off-by:
Jan Kara <jack@suse.cz>
-
Jan Kara authored
The function is already mostly contained in what fsnotify_clear_marks_by_group() does. Just update that function to not select marks when all of them should be destroyed and remove fsnotify_detach_group_marks(). Reviewed-by:
Miklos Szeredi <mszeredi@redhat.com> Reviewed-by:
Amir Goldstein <amir73il@gmail.com> Signed-off-by:
Jan Kara <jack@suse.cz>
-
Jan Kara authored
The _flags() suffix in the function name was more confusing than explaining so just remove it. Also rename the argument from 'flags' to 'type' to better explain what the function expects. Reviewed-by:
Miklos Szeredi <mszeredi@redhat.com> Suggested-by:
Amir Goldstein <amir73il@gmail.com> Reviewed-by:
Amir Goldstein <amir73il@gmail.com> Signed-off-by:
Jan Kara <jack@suse.cz>
-
Jan Kara authored
Inline these helpers as they are very thin. We still keep them as we don't want to expose details about how list type is determined. Reviewed-by:
Miklos Szeredi <mszeredi@redhat.com> Reviewed-by:
Amir Goldstein <amir73il@gmail.com> Signed-off-by:
Jan Kara <jack@suse.cz>
-
Jan Kara authored
These helpers are just very thin wrappers now. Remove them. Reviewed-by:
Miklos Szeredi <mszeredi@redhat.com> Reviewed-by:
Amir Goldstein <amir73il@gmail.com> Signed-off-by:
Jan Kara <jack@suse.cz>
-
Jan Kara authored
These helpers are now only a simple assignment and just obfuscate what is going on. Remove them. Reviewed-by:
Miklos Szeredi <mszeredi@redhat.com> Reviewed-by:
Amir Goldstein <amir73il@gmail.com> Signed-off-by:
Jan Kara <jack@suse.cz>
-
Jan Kara authored
When userspace task processing fanotify permission events screws up and does not respond, fsnotify_mark_srcu SRCU is held indefinitely which causes further hangs in the whole notification subsystem. Although we cannot easily solve the problem of operations blocked waiting for response from userspace, we can at least somewhat localize the damage by dropping SRCU lock before waiting for userspace response and reacquiring it when userspace responds. Reviewed-by:
Miklos Szeredi <mszeredi@redhat.com> Reviewed-by:
Amir Goldstein <amir73il@gmail.com> Signed-off-by:
Jan Kara <jack@suse.cz>
-
Jan Kara authored
Pass fsnotify_iter_info into ->handle_event() handler so that it can release and reacquire SRCU lock via fsnotify_prepare_user_wait() and fsnotify_finish_user_wait() functions. These functions also make sure current marks are appropriately pinned so that iteration protected by srcu in fsnotify() stays safe. Reviewed-by:
Miklos Szeredi <mszeredi@redhat.com> Reviewed-by:
Amir Goldstein <amir73il@gmail.com> Signed-off-by:
Jan Kara <jack@suse.cz>
-
Jan Kara authored
fanotify wants to drop fsnotify_mark_srcu lock when waiting for response from userspace so that the whole notification subsystem is not blocked during that time. This patch provides a framework for safely getting mark reference for a mark found in the object list which pins the mark in that list. We can then drop fsnotify_mark_srcu, wait for userspace response and then safely continue iteration of the object list once we reaquire fsnotify_mark_srcu. Reviewed-by:
Miklos Szeredi <mszeredi@redhat.com> Reviewed-by:
Amir Goldstein <amir73il@gmail.com> Signed-off-by:
Jan Kara <jack@suse.cz>
-
Jan Kara authored
Currently we queue all marks for destruction on group shutdown and then destroy them from fsnotify_destroy_group() instead from a worker thread which is the usual path. However worker can already be processing some list of marks to destroy so this does not make 100% all marks are really destroyed by the time group is shut down. This isn't a big problem as each mark holds group reference and thus group stays partially alive until all marks are really freed but there's no point in complicating our lives - just wait for the delayed work to be finished instead. Reviewed-by:
Miklos Szeredi <mszeredi@redhat.com> Reviewed-by:
Amir Goldstein <amir73il@gmail.com> Signed-off-by:
Jan Kara <jack@suse.cz>
-
Jan Kara authored
Instead of removing mark from object list from fsnotify_detach_mark(), remove the mark when last reference to the mark is dropped. This will allow fanotify to wait for userspace response to event without having to hold onto fsnotify_mark_srcu. To avoid pinning inodes by elevated refcount (and thus e.g. delaying file deletion) while someone holds mark reference, we detach connector from the object also from fsnotify_destroy_marks() and not only after removing last mark from the list as it was now. Reviewed-by:
Miklos Szeredi <mszeredi@redhat.com> Reviewed-by:
Amir Goldstein <amir73il@gmail.com> Signed-off-by:
Jan Kara <jack@suse.cz>
-
Jan Kara authored
Currently we queue mark into a list of marks for destruction in __fsnotify_free_mark() and keep the last mark reference dangling. After the worker waits for SRCU period, it drops the last reference to the mark which frees it. This scheme has the disadvantage that if we hold reference to a mark and drop and reacquire SRCU lock, the mark can get freed immediately which is slightly inconvenient and we will need to avoid this in the future. Move to a scheme where queueing of mark into a list of marks for destruction happens when the last reference to the mark is dropped. Also drop reference to the mark held by group list already when mark is removed from that list instead of dropping it only from the destruction worker. Reviewed-by:
Miklos Szeredi <mszeredi@redhat.com> Reviewed-by:
Amir Goldstein <amir73il@gmail.com> Signed-off-by:
Jan Kara <jack@suse.cz>
-
Jan Kara authored
Dropping mark reference can result in mark being freed. Although it should not happen in inotify_remove_from_idr() since caller should hold another reference, just don't risk lock up just after WARN_ON unnecessarily. Also fold do_inotify_remove_from_idr() into the single callsite as that function really is just two lines of real code. Reviewed-by:
Miklos Szeredi <mszeredi@redhat.com> Reviewed-by:
Amir Goldstein <amir73il@gmail.com> Signed-off-by:
Jan Kara <jack@suse.cz>
-
Jan Kara authored
Currently we free fsnotify_mark_connector structure only when inode / vfsmount is getting freed. This can however impose noticeable memory overhead when marks get attached to inodes only temporarily. So free the connector structure once the last mark is detached from the object. Since notification infrastructure can be working with the connector under the protection of fsnotify_mark_srcu, we have to be careful and free the fsnotify_mark_connector only after SRCU period passes. Reviewed-by:
Miklos Szeredi <mszeredi@redhat.com> Reviewed-by:
Amir Goldstein <amir73il@gmail.com> Signed-off-by:
Jan Kara <jack@suse.cz>
-
Jan Kara authored
So far list of marks attached to an object (inode / vfsmount) was protected by i_lock or mnt_root->d_lock. This dictates that the list must be empty before the object can be destroyed although the list is now anchored in the fsnotify_mark_connector structure. Protect the list by a spinlock in the fsnotify_mark_connector structure to decouple lifetime of a list of marks from a lifetime of the object. This also simplifies the code quite a bit since we don't have to differentiate between inode and vfsmount lists in quite a few places anymore. Reviewed-by:
Miklos Szeredi <mszeredi@redhat.com> Reviewed-by:
Amir Goldstein <amir73il@gmail.com> Signed-off-by:
Jan Kara <jack@suse.cz>
-
Jan Kara authored
After removing all the indirection it is clear that hlist_del_init_rcu(&mark->obj_list); in fsnotify_destroy_marks() is not needed as the mark gets removed from the list shortly afterwards in fsnotify_destroy_mark() -> fsnotify_detach_mark() -> fsnotify_detach_from_object(). Also there is no problem with mark being visible on object list while we call fsnotify_destroy_mark() as parallel destruction of marks from several places is properly handled (as mentioned in the comment in fsnotify_destroy_marks(). So just remove the list removal and also the stale comment. Reviewed-by:
Miklos Szeredi <mszeredi@redhat.com> Reviewed-by:
Amir Goldstein <amir73il@gmail.com> Signed-off-by:
Jan Kara <jack@suse.cz>
-
Jan Kara authored
We lock object list lock in fsnotify_detach_from_object() twice - once to detach mark and second time to recalculate mask. That is unnecessary and later it will become problematic as we will free the connector as soon as there is no mark in it. So move recalculation of fsnotify mask into the same critical section that is detaching mark. This also removes recalculation of child dentry flags from fsnotify_detach_from_object(). That is however fine. Those marks will get recalculated once some event happens on a child. Reviewed-by:
Miklos Szeredi <mszeredi@redhat.com> Signed-off-by:
Jan Kara <jack@suse.cz>
-
Jan Kara authored
fsnotify_detach_mark() calls fsnotify_destroy_inode_mark() or fsnotify_destroy_vfsmount_mark() to remove mark from object list. These two functions are however very similar and differ only in the lock they use to protect the object list of marks. Simplify the code by removing the indirection and removing mark from the object list in a common function. Reviewed-by:
Miklos Szeredi <mszeredi@redhat.com> Reviewed-by:
Amir Goldstein <amir73il@gmail.com> Signed-off-by:
Jan Kara <jack@suse.cz>
-
Jan Kara authored
Instead of passing spinlock into fsnotify_destroy_marks() determine it directly in that function from the connector type. This will reduce code churn when changing lock protecting list of marks. Reviewed-by:
Miklos Szeredi <mszeredi@redhat.com> Reviewed-by:
Amir Goldstein <amir73il@gmail.com> Signed-off-by:
Jan Kara <jack@suse.cz>
-
Jan Kara authored
Move locking of a mark list into fsnotify_find_mark(). This reduces code churn in the following patch changing lock protecting the list. Reviewed-by:
Miklos Szeredi <mszeredi@redhat.com> Reviewed-by:
Amir Goldstein <amir73il@gmail.com> Signed-off-by:
Jan Kara <jack@suse.cz>
-
Jan Kara authored
Move locking of locks protecting a list of marks into fsnotify_recalc_mask(). This reduces code churn in the following patch which changes the lock protecting the list of marks. Reviewed-by:
Miklos Szeredi <mszeredi@redhat.com> Reviewed-by:
Amir Goldstein <amir73il@gmail.com> Signed-off-by:
Jan Kara <jack@suse.cz>
-
Jan Kara authored
Move fsnotify_destroy_marks() to be later in the fs/notify/mark.c. It will need some functions that are declared after its current declaration. No functional change. Reviewed-by:
Miklos Szeredi <mszeredi@redhat.com> Reviewed-by:
Amir Goldstein <amir73il@gmail.com> Signed-off-by:
Jan Kara <jack@suse.cz>
-
Jan Kara authored
Adding notification mark to object list has been currently done through fsnotify_add_{inode|vfsmount}_mark() helpers from fsnotify_add_mark_locked() which call fsnotify_add_mark_list(). Remove this unnecessary indirection to simplify the code. Pushing all the locking to fsnotify_add_mark_list() also allows us to allocate the connector structure with GFP_KERNEL mode. Reviewed-by:
Miklos Szeredi <mszeredi@redhat.com> Reviewed-by:
Amir Goldstein <amir73il@gmail.com> Signed-off-by:
Jan Kara <jack@suse.cz>
-
Jan Kara authored
Currently inode reference is held by fsnotify marks. Change the rules so that inode reference is held by fsnotify_mark_connector structure whenever the list is non-empty. This simplifies the code and is more logical. Reviewed-by:
Miklos Szeredi <mszeredi@redhat.com> Reviewed-by:
Amir Goldstein <amir73il@gmail.com> Signed-off-by:
Jan Kara <jack@suse.cz>
-
Jan Kara authored
Move pointer to inode / vfsmount from mark itself to the fsnotify_mark_connector structure. This is another step on the path towards decoupling inode / vfsmount lifetime from notification mark lifetime. Reviewed-by:
Miklos Szeredi <mszeredi@redhat.com> Reviewed-by:
Amir Goldstein <amir73il@gmail.com> Signed-off-by:
Jan Kara <jack@suse.cz>
-
Jan Kara authored
Currently notification marks are attached to object (inode or vfsmnt) by a hlist_head in the object. The list is also protected by a spinlock in the object. So while there is any mark attached to the list of marks, the object must be pinned in memory (and thus e.g. last iput() deleting inode cannot happen). Also for list iteration in fsnotify() to work, we must hold fsnotify_mark_srcu lock so that mark itself and mark->obj_list.next cannot get freed. Thus we are required to wait for response to fanotify events from userspace process with fsnotify_mark_srcu lock held. That causes issues when userspace process is buggy and does not reply to some event - basically the whole notification subsystem gets eventually stuck. So to be able to drop fsnotify_mark_srcu lock while waiting for response, we have to pin the mark in memory and make sure it stays in the object list (as removing the mark waiting for response could lead to lost notification events for groups later in the list). However we don't want inode reclaim to block on such mark as that would lead to system just locking up elsewhere. This commit is the first in the series that paves way towards solving these conflicting lifetime needs. Instead of anchoring the list of marks directly in the object, we anchor it in a dedicated structure (fsnotify_mark_connector) and just point to that structure from the object. The following commits will also add spinlock protecting the list and object pointer to the structure. Reviewed-by:
Miklos Szeredi <mszeredi@redhat.com> Reviewed-by:
Amir Goldstein <amir73il@gmail.com> Signed-off-by:
Jan Kara <jack@suse.cz>
-
Jan Kara authored
Add a comment that lifetime of a notification mark is protected by SRCU and remove a comment about clearing of marks attached to the inode. It is stale and more uptodate version is at fsnotify_destroy_marks() which is the function handling this case. Reviewed-by:
Miklos Szeredi <mszeredi@redhat.com> Reviewed-by:
Amir Goldstein <amir73il@gmail.com> Signed-off-by:
Jan Kara <jack@suse.cz>
-
Jan Kara authored
Currently audit code uses checking of mark->inode to verify whether mark is still alive. Switch that to checking mark flags as that is more logical and current way will become unreliable in future. Reviewed-by:
Miklos Szeredi <mszeredi@redhat.com> Signed-off-by:
Jan Kara <jack@suse.cz>
-
- 05 Apr, 2017 1 commit
-
-
Jan Kara authored
Audit tree currently uses inode pointer as a key into the hash table. Getting that from notification mark will be somewhat more difficult with coming fsnotify changes. So abstract getting of hash key from the audit chunk and inode so that we can change the method to obtain a key easily. Reviewed-by:
Miklos Szeredi <mszeredi@redhat.com> CC: Paul Moore <paul@paul-moore.com> Acked-by:
Paul Moore <paul@paul-moore.com> Signed-off-by:
Jan Kara <jack@suse.cz>
-
- 03 Apr, 2017 3 commits
-
-
Jan Kara authored
Move recalculation of inode / vfsmount notification mask under group->mark_mutex of the mark which was modified. These are the only places where mask recalculation happens without mark being protected from detaching from inode / vfsmount which will cause issues with the following patches. Reviewed-by:
Miklos Szeredi <mszeredi@redhat.com> Reviewed-by:
Amir Goldstein <amir73il@gmail.com> Signed-off-by:
Jan Kara <jack@suse.cz>
-
Jan Kara authored
Printing inode pointers in warnings has dubious value and with future changes we won't be able to easily get them without either locking or chances we oops along the way. So just remove inode pointers from the warning messages. Reviewed-by:
Miklos Szeredi <mszeredi@redhat.com> Reviewed-by:
Amir Goldstein <amir73il@gmail.com> Signed-off-by:
Jan Kara <jack@suse.cz>
-
Jan Kara authored
show_fdinfo() iterates group's list of marks. All marks found there are guaranteed to be alive and they stay so until we release group->mark_mutex. So remove uncecessary tests whether mark is alive. Reviewed-by:
Miklos Szeredi <mszeredi@redhat.com> Reviewed-by:
Amir Goldstein <amir73il@gmail.com> Signed-off-by:
Jan Kara <jack@suse.cz>
-
- 12 Mar, 2017 3 commits
-
-
Linus Torvalds authored
-
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linuxLinus Torvalds authored
Pull s390 fixes from Martin Schwidefsky: - four patches to get the new cputime code in shape for s390 - add the new statx system call - a few bug fixes * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390: wire up statx system call KVM: s390: Fix guest migration for huge guests resulting in panic s390/ipl: always use load normal for CCW-type re-IPL s390/timex: micro optimization for tod_to_ns s390/cputime: provide archicture specific cputime_to_nsecs s390/cputime: reset all accounting fields on fork s390/cputime: remove last traces of cputime_t s390: fix in-kernel program checks s390/crypt: fix missing unlock in ctr_paes_crypt on error path
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull x86 fixes from Thomas Gleixner: - a fix for the kexec/purgatory regression which was introduced in the merge window via an innocent sparse fix. We could have reverted that commit, but on deeper inspection it turned out that the whole machinery is neither documented nor robust. So a proper cleanup was done instead - the fix for the TLB flush issue which was discovered recently - a simple typo fix for a reboot quirk * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/tlb: Fix tlb flushing when lguest clears PGE kexec, x86/purgatory: Unbreak it and clean it up x86/reboot/quirks: Fix typo in ASUS EeeBook X205TA reboot quirk
-