- 19 Dec, 2016 1 commit
-
-
Mauricio Faria de Oliveira authored
The WRITE_SAME commands are not present in the blk_default_cmd_filter write_ok list, and thus are failed with -EPERM when the SG_IO ioctl() is executed without CAP_SYS_RAWIO capability (e.g., unprivileged users). [ sg_io() -> blk_fill_sghdr_rq() > blk_verify_command() -> -EPERM ] The problem can be reproduced with the sg_write_same command # sg_write_same --num 1 --xferlen 512 /dev/sda # # capsh --drop=cap_sys_rawio -- -c \ 'sg_write_same --num 1 --xferlen 512 /dev/sda' Write same: pass through os error: Operation not permitted # For comparison, the WRITE_VERIFY command does not observe this problem, since it is in that list: # capsh --drop=cap_sys_rawio -- -c \ 'sg_write_verify --num 1 --ilen 512 --lba 0 /dev/sda' # So, this patch adds the WRITE_SAME commands to the list, in order for the SG_IO ioctl to finish successfully: # capsh --drop=cap_sys_rawio -- -c \ 'sg_write_same --num 1 --xferlen 512 /dev/sda' # That case happens to be exercised by QEMU KVM guests with 'scsi-block' devices (qemu "-device scsi-block" [1], libvirt "<disk type='block' device='lun'>" [2]), which employs the SG_IO ioctl() and runs as an unprivileged user (libvirt-qemu). In that scenario, when a filesystem (e.g., ext4) performs its zero-out calls, which are translated to write-same calls in the guest kernel, and then into SG_IO ioctls to the host kernel, SCSI I/O errors may be observed in the guest: [...] sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE [...] sd 0:0:0:0: [sda] tag#0 Sense Key : Aborted Command [current] [...] sd 0:0:0:0: [sda] tag#0 Add. Sense: I/O process terminated [...] sd 0:0:0:0: [sda] tag#0 CDB: Write Same(10) 41 00 01 04 e0 78 00 00 08 00 [...] blk_update_request: I/O error, dev sda, sector 17096824 Links: [1] http://git.qemu.org/?p=qemu.git;a=commit;h=336a6915bc7089fb20fea4ba99972ad9a97c5f52 [2] https://libvirt.org/formatdomain.html#elementsDisks (see 'disk' -> 'device') Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> Signed-off-by: Brahadambal Srinivasan <latha@linux.vnet.ibm.com> Reported-by: Manjunatha H R <manjuhr1@in.ibm.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
-
- 17 Dec, 2016 3 commits
-
-
Ritesh Harjani authored
Signed-off-by: Ritesh Harjani <riteshh@codeaurora.org> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Eric Wheeler authored
Signed-off-by: Eric Wheeler <bcache@linux.ewheeler.net> Tested-by: Wido den Hollander <wido@widodh.nl>
-
Kent Overstreet authored
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
-
- 16 Dec, 2016 36 commits
-
-
git://github.com/ceph/ceph-clientLinus Torvalds authored
Pull ceph updates from Ilya Dryomov: "A varied set of changes: - a large rework of cephx auth code to cope with CONFIG_VMAP_STACK (myself). Also fixed a deadlock caused by a bogus allocation on the writeback path and authorize reply verification. - a fix for long stalls during fsync (Jeff Layton). The client now has a way to force the MDS log flush, leading to ~100x speedups in some synthetic tests. - a new [no]require_active_mds mount option (Zheng Yan). On mount, we will now check whether any of the MDSes are available and bail rather than block if none are. This check can be avoided by specifying the "no" option. - a couple of MDS cap handling fixes and a few assorted patches throughout" * tag 'ceph-for-4.10-rc1' of git://github.com/ceph/ceph-client: (32 commits) libceph: remove now unused finish_request() wrapper libceph: always signal completion when done ceph: avoid creating orphan object when checking pool permission ceph: properly set issue_seq for cap release ceph: add flags parameter to send_cap_msg ceph: update cap message struct version to 10 ceph: define new argument structure for send_cap_msg ceph: move xattr initialzation before the encoding past the ceph_mds_caps ceph: fix minor typo in unsafe_request_wait ceph: record truncate size/seq for snap data writeback ceph: check availability of mds cluster on mount ceph: fix splice read for no Fc capability case ceph: try getting buffer capability for readahead/fadvise ceph: fix scheduler warning due to nested blocking ceph: fix printing wrong return variable in ceph_direct_read_write() crush: include mapper.h in mapper.c rbd: silence bogus -Wmaybe-uninitialized warning libceph: no need to drop con->mutex for ->get_authorizer() libceph: drop len argument of *verify_authorizer_reply() libceph: verify authorize reply on connect ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfsLinus Torvalds authored
Pull overlayfs updates from Miklos Szeredi: "This update contains: - try to clone on copy-up - allow renaming a directory - split source into managable chunks - misc cleanups and fixes It does not contain the read-only fd data inconsistency fix, which Al didn't like. I'll leave that to the next year..." * 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: (36 commits) ovl: fix reStructuredText syntax errors in documentation ovl: fix return value of ovl_fill_super ovl: clean up kstat usage ovl: fold ovl_copy_up_truncate() into ovl_copy_up() ovl: create directories inside merged parent opaque ovl: opaque cleanup ovl: show redirect_dir mount option ovl: allow setting max size of redirect ovl: allow redirect_dir to default to "on" ovl: check for emptiness of redirect dir ovl: redirect on rename-dir ovl: lookup redirects ovl: consolidate lookup for underlying layers ovl: fix nested overlayfs mount ovl: check namelen ovl: split super.c ovl: use d_is_dir() ovl: simplify lookup ovl: check lower existence of rename target ovl: rename: simplify handling of lower/merged directory ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfsLinus Torvalds authored
Pull btrfs updates from Chris Mason: "Jeff Mahoney and Dave Sterba have a really nice set of cleanups in here, and Christoph pitched in corrections/improvements to make btrfs use proper helpers for bio walking instead of doing it by hand. There are some key fixes as well, including some long standing bugs that took forever to track down in btrfs_drop_extents and during balance" * 'for-linus-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (77 commits) btrfs: limit async_work allocation and worker func duration Revert "Btrfs: adjust len of writes if following a preallocated extent" Btrfs: don't WARN() in btrfs_transaction_abort() for IO errors btrfs: opencode chunk locking, remove helpers btrfs: remove root parameter from transaction commit/end routines btrfs: split btrfs_wait_marked_extents into normal and tree log functions btrfs: take an fs_info directly when the root is not used otherwise btrfs: simplify btrfs_wait_cache_io prototype btrfs: convert extent-tree tracepoints to use fs_info btrfs: root->fs_info cleanup, access fs_info->delayed_root directly btrfs: root->fs_info cleanup, add fs_info convenience variables btrfs: root->fs_info cleanup, update_block_group{,flags} btrfs: root->fs_info cleanup, lock/unlock_chunks btrfs: root->fs_info cleanup, btrfs_calc_{trans,trunc}_metadata_size btrfs: pull node/sector/stripe sizes out of root and into fs_info btrfs: root->fs_info cleanup, io_ctl_init btrfs: root->fs_info cleanup, use fs_info->dev_root everywhere btrfs: struct reada_control.root -> reada_control.fs_info btrfs: struct btrfsic_state->root should be an fs_info btrfs: alloc_reserved_file_extent trace point should use extent_root ...
-
git://linux-nfs.org/~bfields/linuxLinus Torvalds authored
Pull nfsd updates from Bruce Fields: "The one new feature is support for a new NFSv4.2 mode_umask attribute that makes ACL inheritance a little more useful in environments that default to restrictive umasks. Requires client-side support, also on its way for 4.10. Other than that, miscellaneous smaller fixes and cleanup, especially to the server rdma code" [ The client side of the umask attribute was merged yesterday ] * tag 'nfsd-4.10' of git://linux-nfs.org/~bfields/linux: nfsd: add support for the umask attribute sunrpc: use DEFINE_SPINLOCK() svcrdma: Further clean-up of svc_rdma_get_inv_rkey() svcrdma: Break up dprintk format in svc_rdma_accept() svcrdma: Remove unused variable in rdma_copy_tail() svcrdma: Remove unused variables in xprt_rdma_bc_allocate() svcrdma: Remove svc_rdma_op_ctxt::wc_status svcrdma: Remove DMA map accounting svcrdma: Remove BH-disabled spin locking in svc_rdma_send() svcrdma: Renovate sendto chunk list parsing svcauth_gss: Close connection when dropping an incoming message svcrdma: Clear xpt_bc_xps in xprt_setup_rdma_bc() error exit arm nfsd: constify reply_cache_stats_operations structure nfsd: update workqueue creation sunrpc: GFP_KERNEL should be GFP_NOFS in crypto code nfsd: catch errors in decode_fattr earlier nfsd: clean up supported attribute handling nfsd: fix error handling for clients that fail to return the layout nfsd: more robust allocation failure handling in nfsd_reply_cache_init
-
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds authored
Pull vfs updates from Al Viro: - more ->d_init() stuff (work.dcache) - pathname resolution cleanups (work.namei) - a few missing iov_iter primitives - copy_from_iter_full() and friends. Either copy the full requested amount, advance the iterator and return true, or fail, return false and do _not_ advance the iterator. Quite a few open-coded callers converted (and became more readable and harder to fuck up that way) (work.iov_iter) - several assorted patches, the big one being logfs removal * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: logfs: remove from tree vfs: fix put_compat_statfs64() does not handle errors namei: fold should_follow_link() with the step into not-followed link namei: pass both WALK_GET and WALK_MORE to should_follow_link() namei: invert WALK_PUT logics namei: shift interpretation of LOOKUP_FOLLOW inside should_follow_link() namei: saner calling conventions for mountpoint_last() namei.c: get rid of user_path_parent() switch getfrag callbacks to ..._full() primitives make skb_add_data,{_nocache}() and skb_copy_to_page_nocache() advance only on success [iov_iter] new primitives - copy_from_iter_full() and friends don't open-code file_inode() ceph: switch to use of ->d_init() ceph: unify dentry_operations instances lustre: switch to use of ->d_init()
-
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-mediaLinus Torvalds authored
Pull media updates from Mauro Carvalho Chehab: - new Mediatek drivers: mtk-mdp and mtk-vcodec - some additions at the media documentation - the CEC core and drivers were promoted from staging to mainstream - some cleanups at the DVB core - the LIRC serial driver got promoted from staging to mainstream - added a driver for Renesas R-Car FDP1 driver - add DVBv5 statistics support to mn88473 driver - several fixes related to printk continuation lines - add support for HSV encoding formats - lots of other cleanups, fixups and driver improvements. * tag 'media/v4.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (496 commits) [media] v4l: tvp5150: Add missing break in set control handler [media] v4l: tvp5150: Don't inline the tvp5150_selmux() function [media] v4l: tvp5150: Compile tvp5150_link_setup out if !CONFIG_MEDIA_CONTROLLER [media] em28xx: don't store usb_device at struct em28xx [media] em28xx: use usb_interface for dev_foo() calls [media] em28xx: don't change the device's name [media] mn88472: fix chip id check on probe [media] mn88473: fix chip id check on probe [media] lirc: fix error paths in lirc_cdev_add() [media] s5p-mfc: Add support for MFC v8 available in Exynos 5433 SoCs [media] s5p-mfc: Rework clock handling [media] s5p-mfc: Don't keep clock prepared all the time [media] s5p-mfc: Kill all IS_ERR_OR_NULL in clocks management code [media] s5p-mfc: Remove dead conditional code [media] s5p-mfc: Ensure that clock is disabled before turning power off [media] s5p-mfc: Remove special clock rate management [media] s5p-mfc: Use printk_ratelimited for reporting ioctl errors [media] s5p-mfc: Set DMA_ATTR_ALLOC_SINGLE_PAGES [media] vivid: Set color_enc on HSV formats [media] v4l2-tpg: Init hv_enc field with a valid value ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edacLinus Torvalds authored
Pull edac updates from Mauro Carvalho Chehab: "This contains the conversion of the EDAC uAPI documentation to ReST and the addition of the EDAC kAPI documentation to the driver-api docs. It also splits the EDAC headers by their functions" * tag 'edac/v4.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac: EDAC: Document HW_EVENT_ERR_DEFERRED type edac.rst: move concepts dictionary from edac.h edac: fix kenel-doc markups at edac.h edac: fix kernel-doc tags at the drivers/edac_*.h edac: adjust docs location at MAINTAINERS and 00-INDEX driver-api: create an edac.rst file with EDAC documentation edac: move documentation from edac_mc.c to edac_core.h edac: move documentation from edac_pci*.c to edac_pci.h edac: move documentation from edac_device to edac_core.h edac: rename edac_core.h to edac_mc.h edac: move EDAC device definitions to drivers/edac/edac_device.h edac: move EDAC PCI definitions to drivers/edac/edac_pci.h docs-rst: admin-guide: add documentation for EDAC edac.txt: Improve documentation, adding RAS introduction edac.txt: update information about newer Intel CPUs edac.txt: remove info that the Nehalem EDAC is experimental edac.txt: convert EDAC documentation to ReST edac.txt: add a section explaining the dimmX and rankX directories edac: edac_core.h: remove prototype for edac_pci_reset_delay_period() edac: edac_core.h: get rid of unused kobj_complete
-
git://git.kernel.org/pub/scm/linux/kernel/git/rw/umlLinus Torvalds authored
Pull UML update from Richard Weinberger: "A performance enhancement for UML's block driver" * 'for-linus-4.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml: um: UBD Improvements
-
git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2Linus Torvalds authored
Pull arch/nios2 updates from Ley Foon Tan: - add screen_info - Convert pfn_valid to static inline - Extend !__ASSEMBLY__ section in asm/page.h * tag 'nios2-v4.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2: nios2: add screen_info nios2: Convert pfn_valid to static inline nios2: Extend !__ASSEMBLY__ section in asm/page.h
-
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linuxLinus Torvalds authored
Pull powerpc updates from Michael Ellerman: "Highlights include: - Support for the kexec_file_load() syscall, which is a prereq for secure and trusted boot. - Prevent kernel execution of userspace on P9 Radix (similar to SMEP/PXN). - Sort the exception tables at build time, to save time at boot, and store them as relative offsets to save space in the kernel image & memory. - Allow building the kernel with thin archives, which should allow us to build an allyesconfig once some other fixes land. - Build fixes to allow us to correctly rebuild when changing the kernel endian from big to little or vice versa. - Plumbing so that we can avoid doing a full mm TLB flush on P9 Radix. - Initial stack protector support (-fstack-protector). - Support for dumping the radix (aka. Linux) and hash page tables via debugfs. - Fix an oops in cxl coredump generation when cxl_get_fd() is used. - Freescale updates from Scott: "Highlights include 8xx hugepage support, qbman fixes/cleanup, device tree updates, and some misc cleanup." - Many and varied fixes and minor enhancements as always. Thanks to: Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V, Anshuman Khandual, Anton Blanchard, Balbir Singh, Bartlomiej Zolnierkiewicz, Christophe Jaillet, Christophe Leroy, Denis Kirjanov, Elimar Riesebieter, Frederic Barrat, Gautham R. Shenoy, Geliang Tang, Geoff Levand, Jack Miller, Johan Hovold, Lars-Peter Clausen, Libin, Madhavan Srinivasan, Michael Neuling, Nathan Fontenot, Naveen N. Rao, Nicholas Piggin, Pan Xinhui, Peter Senna Tschudin, Rashmica Gupta, Rui Teng, Russell Currey, Scott Wood, Simon Guo, Suraj Jitindar Singh, Thiago Jung Bauermann, Tobias Klauser, Vaibhav Jain" [ And thanks to Michael, who took time off from a new baby to get this pull request done. - Linus ] * tag 'powerpc-4.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (174 commits) powerpc/fsl/dts: add FMan node for t1042d4rdb powerpc/fsl/dts: add sg_2500_aqr105_phy4 alias on t1024rdb powerpc/fsl/dts: add QMan and BMan nodes on t1024 powerpc/fsl/dts: add QMan and BMan nodes on t1023 soc/fsl/qman: test: use DEFINE_SPINLOCK() powerpc/fsl-lbc: use DEFINE_SPINLOCK() powerpc/8xx: Implement support of hugepages powerpc: get hugetlbpage handling more generic powerpc: port 64 bits pgtable_cache to 32 bits powerpc/boot: Request no dynamic linker for boot wrapper soc/fsl/bman: Use resource_size instead of computation soc/fsl/qe: use builtin_platform_driver powerpc/fsl_pmc: use builtin_platform_driver powerpc/83xx/suspend: use builtin_platform_driver powerpc/ftrace: Fix the comments for ftrace_modify_code powerpc/perf: macros for power9 format encoding powerpc/perf: power9 raw event format encoding powerpc/perf: update attribute_group data structure powerpc/perf: factor out the event format field powerpc/mm/iommu, vfio/spapr: Put pages on VFIO container shutdown ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linuxLinus Torvalds authored
Pull m ore s390 updates from Martin Schwidefsky: "Over 95% of the changes in this pull request are related to the zcrypt driver. There are five improvements for zcrypt: the ID for the CEX6 cards is added, workload balancing and multi-domain support are introduced, the debug logs are overhauled and a set of tracepoints is added. Then there are several patches in regard to inline assemblies. One compile fix and several missing memory clobbers. As far as we can tell the omitted memory clobbers have not caused any breakage. A small change to the PCI arch code, the machine can tells us how big the function measurement blocks are. The PCI function measurement will be disabled for a device if the queried length is larger than the allocated size for these blocks. And two more patches to correct five printk messages. That is it for s390 in regard to the 4.10 merge window. Happy holidays" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (23 commits) s390/pci: query fmb length s390/zcrypt: add missing memory clobber to ap_qci inline assembly s390/extmem: add missing memory clobber to dcss_set_subcodes s390/nmi: fix inline assembly constraints s390/lib: add missing memory barriers to string inline assemblies s390/cpumf: fix qsi inline assembly s390/setup: reword printk messages s390/dasd: fix typos in DASD error messages s390: fix compile error with memmove_early() inline assembly s390/zcrypt: tracepoint definitions for zcrypt device driver. s390/zcrypt: Rework debug feature invocations. s390/zcrypt: Improved invalid domain response handling. s390/zcrypt: Fix ap_max_domain_id for older machine types s390/zcrypt: Correct function bits for CEX2x and CEX3x cards. s390/zcrypt: Fixed attrition of AP adapters and domains s390/zcrypt: Introduce new zcrypt device status API s390/zcrypt: add multi domain support s390/zcrypt: Introduce workload balancing s390/zcrypt: get rid of ap_poll_requests s390/zcrypt: header for the AP inline assmblies ...
-
Amir Goldstein authored
- Fix broken long line block quote - Fix missing newline before bullets list - Use correct numbered list syntax Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Geliang Tang authored
If kcalloc() failed, the return value of ovl_fill_super() is -EINVAL, not -ENOMEM. So this patch sets this value to -ENOMEM before calling kcalloc(), and sets it back to -EINVAL after calling kcalloc(). Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Al Viro authored
FWIW, there's a bit of abuse of struct kstat in overlayfs object creation paths - for one thing, it ends up with a very small subset of struct kstat (mode + rdev), for another it also needs link in case of symlinks and ends up passing it separately. IMO it would be better to introduce a separate object for that. In principle, we might even lift that thing into general API and switch ->mkdir()/->mknod()/->symlink() to identical calling conventions. Hell knows, perhaps ->create() as well... Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Amir Goldstein authored
This removes code duplication. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Amir Goldstein authored
The benefit of making directories opaque on creation is that lookups can stop short when they reach the original created directory, instead of continue lookup the entire depth of parent directory stack. The best case is overlay with N layers, performing lookup for first level directory, which exists only in upper. In that case, there will be only one lookup instead of N. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Miklos Szeredi authored
oe->opaque is set for a) whiteouts b) directories having the "trusted.overlay.opaque" xattr Case b can be simplified, since setting the xattr always implies setting oe->opaque. Also once set, the opaque flag is never cleared. Don't need to set opaque flag for non-directories. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Amir Goldstein authored
Show the value of redirect_dir in /proc/mounts. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Miklos Szeredi authored
Add a module option to allow tuning the max size of absolute redirects. Default is 256. Size of relative redirects is naturally limited by the the underlying filesystem's max filename length (usually 255). Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Miklos Szeredi authored
This patch introduces a kernel config option and a module param. Both can be used independently to turn the default value of redirect_dir on or off. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Amir Goldstein authored
Before introducing redirect_dir feature, the condition !ovl_lower_positive(dentry) for a directory, implied that it is a pure upper directory, which may be removed if empty. Now that directory can be redirect, it is possible that upper does not cover any lower (i.e. !ovl_lower_positive(dentry)), but the directory is a merge (with redirected path) and maybe non empty. Check for this case in ovl_remove_upper(). This change fixes the following test case from rename-pop-dir.py of unionmount-testsuite: """Remove dir and rename old name""" d = ctx.non_empty_dir() d2 = ctx.no_dir() ctx.rmdir(d, err=ENOTEMPTY) ctx.rename(d, d2) ctx.rmdir(d, err=ENOENT) ctx.rmdir(d2, err=ENOTEMPTY) ./run --ov rename-pop-dir /mnt/a/no_dir103: Expected error (Directory not empty) was not produced Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Miklos Szeredi authored
Current code returns EXDEV when a directory would need to be copied up to move. We could copy up the directory tree in this case, but there's another, simpler solution: point to old lower directory from moved upper directory. This is achieved with a "trusted.overlay.redirect" xattr storing the path relative to the root of the overlay. After such attribute has been set, the directory can be moved without further actions required. This is a backward incompatible feature, old kernels won't be able to correctly mount an overlay containing redirected directories. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Miklos Szeredi authored
If a directory has the "trusted.overlay.redirect" xattr, it means that the value of the xattr should be used to find the underlying directory on the next lower layer. The redirect may be relative or absolute. Absolute redirects begin with a slash. A relative redirect means: instead of the current dentry's name use the value of the redirect to find the directory in the next lower layer. Relative redirects must not contain a slash. An absolute redirect means: look up the directory relative to the root of the overlay using the value of the redirect in the next lower layer. Redirects work on lower layers as well. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Miklos Szeredi authored
Use a common helper for lookup of upper and lower layers. This paves the way for looking up directory redirects. No functional change. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Amir Goldstein authored
When the upper overlayfs checks "trusted.overlay.*" xattr on the underlying overlayfs mount, it gets -EPERM, which confuses the upper overlayfs. Fix this by returning -EOPNOTSUPP instead of -EPERM from ovl_own_xattr_get() and ovl_own_xattr_set(). This behavior is consistent with the behavior of ovl_listxattr(), which filters out the private overlayfs xattrs. Note: nested overlays are deprecated. But this change makes sense regardless: these xattrs are private to the overlay and should always be hidden. Hence getting and setting them should indicate this. [SzMi: Use EOPNOTSUPP instead of ENODATA and use it for both getting and setting "trusted.overlay." xattrs. This is a perfectly valid error code for "we don't support this prefix", which is the case here.] Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Miklos Szeredi authored
We already calculate f_namelen in statfs as the maximum of the name lengths provided by the filesystems taking part in the overlay. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Miklos Szeredi authored
fs/overlayfs/super.c is the biggest of the overlayfs source files and it contains various utility functions as well as the rather complicated lookup code. Split these parts out to separate files. Before: 1446 fs/overlayfs/super.c After: 919 fs/overlayfs/super.c 267 fs/overlayfs/namei.c 235 fs/overlayfs/util.c 51 fs/overlayfs/ovl_entry.h Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Miklos Szeredi authored
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Miklos Szeredi authored
If encountering a non-directory, then stop looking at lower layers. In this case the oe->opaque flag is not set anymore, which doesn't matter since existence of lower file is now checked at remove/rename time. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Miklos Szeredi authored
Check if something exists on the lower layer(s) under the target or rename to decide if directory needs to be marked "opaque". Marking opaque is done before the rename, and on failure the marking was undone. Also the opaque xattr was removed if the target didn't cover anything. This patch changes behavior so that removal of "opaque" is not done in either of the above cases. This means that directory may have the opaque flag even if it doesn't cover anything. However this shouldn't affect the performance or semantics of the overalay, while simplifying the code. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Miklos Szeredi authored
d_is_dir() is safe to call on a negative dentry. Use this fact to simplify handling of the lower or merged directories. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Miklos Szeredi authored
The remainging uses of __OVL_PATH_PURE can be replaced by ovl_dentry_is_opaque(). Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Miklos Szeredi authored
Currently ovl_lookup() checks existence of lower file even if there's a non-directory on upper (which is always opaque). This is done so that remove can decide whether a whiteout is needed or not. It would be better to defer this check to unlink, since most of the time the gathered information about opaqueness will be unused. This adds a helper ovl_lower_positive() that checks if there's anything on the lower layer(s). The following patches also introduce changes to how the "opaque" attribute is updated on directories: this attribute is added when the directory is creted or moved over a whiteout or object covering something on the lower layer. However following changes will allow the attribute to remain on the directory after being moved, even if the new location doesn't cover anything. Because of this, we need to check lower layers even for opaque directories, so that whiteout is only created when necessary. This function will later be also used to decide about marking a directory opaque, so deal with negative dentries as well. When dealing with negative, it's enough to check for being a whiteout If the dentry is positive but not upper then it also obviously needs whiteout/opaque. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Miklos Szeredi authored
And use it instead of ovl_dentry_is_opaque() where appropriate. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Miklos Szeredi authored
Since commit 07a2daab ("ovl: Copy up underlying inode's ->i_mode to overlay inode") sticky checking on overlay inode is performed by the vfs, so checking against sticky on underlying inode is not needed. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Miklos Szeredi authored
This is redundant, the vfs already performed this check (and was broken, see commit 9409e22a ("vfs: rename: check backing inode being equal")). Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-