- 27 Apr, 2016 29 commits
-
-
Christoph Hellwig authored
commit 294ab783 upstream. It looks like the Kconfig check that was meant to fix this (commit fe9233fb [SCSI] scsi_dh: fix kconfig related build errors) was actually reversed, but no-one noticed until the new set of patches which separated DM and SCSI_DH). Fixes: fe9233fbSigned-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: James Bottomley <JBottomley@Odin.com> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Hin-Tak Leung authored
commit b4cc0efe upstream. Fix B-tree corruption when a new record is inserted at position 0 in the node in hfs_brec_insert(). This is an identical change to the corresponding hfs b-tree code to Sergei Antonov's "hfsplus: fix B-tree corruption after insertion at position 0", to keep similar code paths in the hfs and hfsplus drivers in sync, where appropriate. Signed-off-by: Hin-Tak Leung <htl10@users.sourceforge.net> Cc: Sergei Antonov <saproj@gmail.com> Cc: Joe Perches <joe@perches.com> Reviewed-by: Vyacheslav Dubeyko <slava@dubeyko.com> Cc: Anton Altaparmakov <anton@tuxera.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Hin-Tak Leung authored
commit 7cb74be6 upstream. Pages looked up by __hfs_bnode_create() (called by hfs_bnode_create() and hfs_bnode_find() for finding or creating pages corresponding to an inode) are immediately kmap()'ed and used (both read and write) and kunmap()'ed, and should not be page_cache_release()'ed until hfs_bnode_free(). This patch fixes a problem I first saw in July 2012: merely running "du" on a large hfsplus-mounted directory a few times on a reasonably loaded system would get the hfsplus driver all confused and complaining about B-tree inconsistencies, and generates a "BUG: Bad page state". Most recently, I can generate this problem on up-to-date Fedora 22 with shipped kernel 4.0.5, by running "du /" (="/" + "/home" + "/mnt" + other smaller mounts) and "du /mnt" simultaneously on two windows, where /mnt is a lightly-used QEMU VM image of the full Mac OS X 10.9: $ df -i / /home /mnt Filesystem Inodes IUsed IFree IUse% Mounted on /dev/mapper/fedora-root 3276800 551665 2725135 17% / /dev/mapper/fedora-home 52879360 716221 52163139 2% /home /dev/nbd0p2 4294967295 1387818 4293579477 1% /mnt After applying the patch, I was able to run "du /" (60+ times) and "du /mnt" (150+ times) continuously and simultaneously for 6+ hours. There are many reports of the hfsplus driver getting confused under load and generating "BUG: Bad page state" or other similar issues over the years. [1] The unpatched code [2] has always been wrong since it entered the kernel tree. The only reason why it gets away with it is that the kmap/memcpy/kunmap follow very quickly after the page_cache_release() so the kernel has not had a chance to reuse the memory for something else, most of the time. The current RW driver appears to have followed the design and development of the earlier read-only hfsplus driver [3], where-by version 0.1 (Dec 2001) had a B-tree node-centric approach to read_cache_page()/page_cache_release() per bnode_get()/bnode_put(), migrating towards version 0.2 (June 2002) of caching and releasing pages per inode extents. When the current RW code first entered the kernel [2] in 2005, there was an REF_PAGES conditional (and "//" commented out code) to switch between B-node centric paging to inode-centric paging. There was a mistake with the direction of one of the REF_PAGES conditionals in __hfs_bnode_create(). In a subsequent "remove debug code" commit [4], the read_cache_page()/page_cache_release() per bnode_get()/bnode_put() were removed, but a page_cache_release() was mistakenly left in (propagating the "REF_PAGES <-> !REF_PAGE" mistake), and the commented-out page_cache_release() in bnode_release() (which should be spanned by !REF_PAGES) was never enabled. References: [1]: Michael Fox, Apr 2013 http://www.spinics.net/lists/linux-fsdevel/msg63807.html ("hfsplus volume suddenly inaccessable after 'hfs: recoff %d too large'") Sasha Levin, Feb 2015 http://lkml.org/lkml/2015/2/20/85 ("use after free") https://bugs.launchpad.net/ubuntu/+source/linux/+bug/740814 https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1027887 https://bugzilla.kernel.org/show_bug.cgi?id=42342 https://bugzilla.kernel.org/show_bug.cgi?id=63841 https://bugzilla.kernel.org/show_bug.cgi?id=78761 [2]: http://git.kernel.org/cgit/linux/kernel/git/tglx/history.git/commit/\ fs/hfs/bnode.c?id=d1081202 commit d1081202 Author: Andrew Morton <akpm@osdl.org> Date: Wed Feb 25 16:17:36 2004 -0800 [PATCH] HFS rewrite http://git.kernel.org/cgit/linux/kernel/git/tglx/history.git/commit/\ fs/hfsplus/bnode.c?id=91556682 commit 91556682 Author: Andrew Morton <akpm@osdl.org> Date: Wed Feb 25 16:17:48 2004 -0800 [PATCH] HFS+ support [3]: http://sourceforge.net/projects/linux-hfsplus/ http://sourceforge.net/projects/linux-hfsplus/files/Linux%202.4.x%20patch/hfsplus%200.1/ http://sourceforge.net/projects/linux-hfsplus/files/Linux%202.4.x%20patch/hfsplus%200.2/ http://linux-hfsplus.cvs.sourceforge.net/viewvc/linux-hfsplus/linux/\ fs/hfsplus/bnode.c?r1=1.4&r2=1.5 Date: Thu Jun 6 09:45:14 2002 +0000 Use buffer cache instead of page cache in bnode.c. Cache inode extents. [4]: http://git.kernel.org/cgit/linux/kernel/git/\ stable/linux-stable.git/commit/?id=a5e3985f commit a5e3985f Author: Roman Zippel <zippel@linux-m68k.org> Date: Tue Sep 6 15:18:47 2005 -0700 [PATCH] hfs: remove debug code Signed-off-by: Hin-Tak Leung <htl10@users.sourceforge.net> Signed-off-by: Sergei Antonov <saproj@gmail.com> Reviewed-by: Anton Altaparmakov <anton@tuxera.com> Reported-by: Sasha Levin <sasha.levin@oracle.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@infradead.org> Cc: Vyacheslav Dubeyko <slava@dubeyko.com> Cc: Sougata Santra <sougata@tuxera.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Kees Cook authored
commit a068acf2 upstream. Many file systems that implement the show_options hook fail to correctly escape their output which could lead to unescaped characters (e.g. new lines) leaking into /proc/mounts and /proc/[pid]/mountinfo files. This could lead to confusion, spoofed entries (resulting in things like systemd issuing false d-bus "mount" notifications), and who knows what else. This looks like it would only be the root user stepping on themselves, but it's possible weird things could happen in containers or in other situations with delegated mount privileges. Here's an example using overlay with setuid fusermount trusting the contents of /proc/mounts (via the /etc/mtab symlink). Imagine the use of "sudo" is something more sneaky: $ BASE="ovl" $ MNT="$BASE/mnt" $ LOW="$BASE/lower" $ UP="$BASE/upper" $ WORK="$BASE/work/ 0 0 none /proc fuse.pwn user_id=1000" $ mkdir -p "$LOW" "$UP" "$WORK" $ sudo mount -t overlay -o "lowerdir=$LOW,upperdir=$UP,workdir=$WORK" none /mnt $ cat /proc/mounts none /root/ovl/mnt overlay rw,relatime,lowerdir=ovl/lower,upperdir=ovl/upper,workdir=ovl/work/ 0 0 none /proc fuse.pwn user_id=1000 0 0 $ fusermount -u /proc $ cat /proc/mounts cat: /proc/mounts: No such file or directory This fixes the problem by adding new seq_show_option and seq_show_option_n helpers, and updating the vulnerable show_option handlers to use them as needed. Some, like SELinux, need to be open coded due to unusual existing escape mechanisms. [akpm@linux-foundation.org: add lost chunk, per Kees] [keescook@chromium.org: seq_show_option should be using const parameters] Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Acked-by: Jan Kara <jack@suse.com> Acked-by: Paul Moore <paul@paul-moore.com> Cc: J. R. Okajima <hooanon05g@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [lizf: Backported to 3.4: - adjust context - one more place in ceph needs to be changed - drop changes to overlayfs - drop showing vers in cifs] Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Andrey Ryabinin authored
commit 71c6da84 upstream. Currently context size (cra_ctxsize) doesn't specified for ghash_async_alg. Which means it's zero. Thus crypto_create_tfm() doesn't allocate needed space for ghash_async_ctx, so any read/write to ctx (e.g. in ghash_async_init_tfm()) is not valid. Signed-off-by: Andrey Ryabinin <aryabinin@odin.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Mikulas Patocka authored
commit f49a26e7 upstream. Update ctime and mtime when a directory is modified. (though OS/2 doesn't update them anyway) Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Christoph Hellwig authored
commit b632ffa7 upstream. We have many WR opcodes that are only supported in kernel space and/or require optional information to be copied into the WR structure. Reject all those not explicitly handled so that we can't pass invalid information to drivers. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Jeffery Miller authored
commit 09bfda10 upstream. With the radeon driver loaded the HP Compaq dc5750 Small Form Factor machine fails to resume from suspend. Adding a quirk similar to other devices avoids the problem and the system resumes properly. Signed-off-by: Jeffery Miller <jmiller@neverware.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Yishai Hadas authored
commit 35d4a0b6 upstream. Fixes: 2a72f212 ("IB/uverbs: Remove dev_table") Before this commit there was a device look-up table that was protected by a spin_lock used by ib_uverbs_open and by ib_uverbs_remove_one. When it was dropped and container_of was used instead, it enabled the race with remove_one as dev might be freed just after: dev = container_of(inode->i_cdev, struct ib_uverbs_device, cdev) but before the kref_get. In addition, this buggy patch added some dead code as container_of(x,y,z) can never be NULL and so dev can never be NULL. As a result the comment above ib_uverbs_open saying "the open method will either immediately run -ENXIO" is wrong as it can never happen. The solution follows Jason Gunthorpe suggestion from below URL: https://www.mail-archive.com/linux-rdma@vger.kernel.org/msg25692.html cdev will hold a kref on the parent (the containing structure, ib_uverbs_device) and only when that kref is released it is guaranteed that open will never be called again. In addition, fixes the active count scheme to use an atomic not a kref to prevent WARN_ON as pointed by above comment from Jason. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Shachar Raindel <raindel@mellanox.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Noa Osherovich authored
commit 5e99b139 upstream. The mlx4 IB driver implementation for ib_query_ah used a wrong offset (28 instead of 29) when link type is Ethernet. Fixed to use the correct one. Fixes: fa417f7b ('IB/mlx4: Add support for IBoE') Signed-off-by: Shani Michaeli <shanim@mellanox.com> Signed-off-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Trond Myklebust authored
commit 0c78789e upstream. In case the reconnection attempt fails. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> [lizf: Backported to 3.4: add definition of variable xprt] Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Grant Likely authored
commit 7f5dcaf1 upstream. The unregister path of platform_device is broken. On registration, it will register all resources with either a parent already set, or type==IORESOURCE_{IO,MEM}. However, on unregister it will release everything with type==IORESOURCE_{IO,MEM}, but ignore the others. There are also cases where resources don't get registered in the first place, like with devices created by of_platform_populate()*. Fix the unregister path to be symmetrical with the register path by checking the parent pointer instead of the type field to decide which resources to unregister. This is safe because the upshot of the registration path algorithm is that registered resources have a parent pointer, and non-registered resources do not. * It can be argued that of_platform_populate() should be registering it's resources, and they argument has some merit. However, there are quite a few platforms that end up broken if we try to do that due to overlapping resources in the device tree. Until that is fixed, we need to solve the immediate problem. Cc: Pantelis Antoniou <pantelis.antoniou@konsulko.com> Cc: Wolfram Sang <wsa@the-dreams.de> Cc: Rob Herring <robh@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com> Signed-off-by: Grant Likely <grant.likely@linaro.org> Tested-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com> Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
David Daney authored
commit 3a496b00 upstream. If the internal call to of_address_to_resource() fails, we end up looping forever in of_find_matching_node_by_address(). This can be caused by a defective device tree, or calling with an incorrect matches argument. Fix by calling of_find_matching_node() unconditionally at the end of the loop. Signed-off-by: David Daney <david.daney@cavium.com> Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Stephen Chandler Paul authored
commit 924f92bf upstream. Most of the time this isn't an issue since hotplugging an adaptor will trigger a crtc mode change which in turn, causes the driver to probe every DisplayPort for a dpcd. However, in cases where hotplugging doesn't cause a mode change (specifically when one unplugs a monitor from a DisplayPort connector, then plugs that same monitor back in seconds later on the same port without any other monitors connected), we never probe for the dpcd before starting the initial link training. What happens from there looks like this: - GPU has only one monitor connected. It's connected via DisplayPort, and does not go through an adaptor of any sort. - User unplugs DisplayPort connector from GPU. - Change in HPD is detected by the driver, we probe every DisplayPort for a possible connection. - Probe the port the user originally had the monitor connected on for it's dpcd. This fails, and we clear the first (and only the first) byte of the dpcd to indicate we no longer have a dpcd for this port. - User plugs the previously disconnected monitor back into the same DisplayPort. - radeon_connector_hotplug() is called before everyone else, and tries to handle the link training. Since only the first byte of the dpcd is zeroed, the driver is able to complete link training but does so against the wrong dpcd, causing it to initialize the link with the wrong settings. - Display stays blank (usually), dpcd is probed after the initial link training, and the driver prints no obvious messages to the log. In theory, since only one byte of the dpcd is chopped off (specifically, the byte that contains the revision information for DisplayPort), it's not entirely impossible that this bug may not show on certain monitors. For instance, the only reason this bug was visible on my ASUS PB238 monitor was due to the fact that this monitor using the enhanced framing symbol sequence, the flag for which is ignored if the radeon driver thinks that the DisplayPort version is below 1.1. Signed-off-by: Stephen Chandler Paul <cpaul@redhat.com> Reviewed-by: Jerome Glisse <jglisse@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Tyler Hicks authored
commit 5556e7e6 upstream. Consider eCryptfs dcache entries to be stale when the corresponding lower inode's i_nlink count is zero. This solves a problem caused by the lower inode being directly modified, without going through the eCryptfs mount, leaving stale eCryptfs dentries cached and the eCryptfs inode's i_nlink count not being cleared. Signed-off-by: Tyler Hicks <tyhicks@canonical.com> Reported-by: Richard Weinberger <richard@nod.at> [bwh: Backported to 3.2: - Test d_revalidate pointer directly rather than a DCACHE_OP flag - Open-code d_inode() - Adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Matthijs Kooijman authored
commit 1fb8dc36 upstream. CustomWare uses the FTDI VID with custom PIDs for their ShipModul MiniPlex products. Signed-off-by: Matthijs Kooijman <matthijs@stdin.nl> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Peter Chen authored
commit 0521cfd0 upstream. The ehci platform device's drvdata is the pointer of struct usb_hcd already, so we doesn't need to call bus_to_hcd conversion again. Signed-off-by: Peter Chen <peter.chen@freescale.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
NeilBrown authored
commit efcbc04e upstream. It is unusual to combine the open flags O_RDONLY and O_EXCL, but it appears that libre-office does just that. [pid 3250] stat("/home/USER/.config", {st_mode=S_IFDIR|0700, st_size=8192, ...}) = 0 [pid 3250] open("/home/USER/.config/libreoffice/4-suse/user/extensions/buildid", O_RDONLY|O_EXCL <unfinished ...> NFSv4 takes O_EXCL as a sign that a setattr command should be sent, probably to reset the timestamps. When it was an O_RDONLY open, the SETATTR command does not identify any actual attributes to change. If no delegation was provided to the open, the SETATTR uses the all-zeros stateid and the request is accepted (at least by the Linux NFS server - no harm, no foul). If a read-delegation was provided, this is used in the SETATTR request, and a Netapp filer will justifiably claim NFS4ERR_BAD_STATEID, which the Linux client takes as a sign to retry - indefinitely. So only treat O_EXCL specially if O_CREAT was also given. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> [lizf: Backported to 3.4: adjust context] Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Paul Bolle authored
commit fe2b5921 upstream. wf_unregister_client() increments the client count when a client unregisters. That is obviously incorrect. Decrement that client count instead. Fixes: 75722d39 ("[PATCH] ppc64: Thermal control for SMU based machines") Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Masahiro Yamada authored
commit 64526370 upstream. Currently, devres_get() passes devres_free() the pointer to devres, but devres_free() should be given with the pointer to resource data. Fixes: 9ac7849e ("devres: device resource management") Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Sudip Mukherjee authored
commit bab383de upstream. parport_find_base() will implicitly do parport_get_port() which increases the refcount. Then parport_register_device() will again increment the refcount. But while unloading the module we are only doing parport_unregister_device() decrementing the refcount only once. We add an parport_put_port() to neutralize the effect of parport_get_port(). Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
NeilBrown authored
commit 199dc6ed upstream. When a (e.g.) RAID5 array is reshaped to RAID0, the updating of queue parameters (e.g. max number of sectors per bio) is done in the wrong place. It should be part of ->run, but it is actually part of ->takeover. This means it happens before level_store() calls: blk_set_stacking_limits(&mddev->queue->limits); and so it ineffective. This can lead to errors from underlying devices. So move all the relevant settings out of create_stripe_zones() and into raid0_run(). As this can lead to a bug-on it is suitable for any -stable kernel which supports reshape to RAID0. So 2.6.35 or later. As the bug has been present for five years there is no urgency, so no need to rush into -stable. Fixes: 9af204cf ("md: Add support for Raid5->Raid0 and Raid10->Raid0 takeover") Reported-by: Yi Zhang <yizhan@redhat.com> Signed-off-by: NeilBrown <neilb@suse.com> [lizf: Backported to 3.4: - adjust context - remove changes to discard and write-same features] Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Chuck Lever authored
commit 9d11b51c upstream. The Linux NFS server returns garbage in the data payload of inline NFS/RDMA READ replies. These are READs of under 1000 bytes or so where the client has not provided either a reply chunk or a write list. The NFS server delivers the data payload for an NFS READ reply to the transport in an xdr_buf page list. If the NFS client did not provide a reply chunk or a write list, send_reply() is supposed to set up a separate sge for the page containing the READ data, and another sge for XDR padding if needed, then post all of the sges via a single SEND Work Request. The problem is send_reply() does not advance through the xdr_buf when setting up scatter/gather entries for SEND WR. It always calls dma_map_xdr with xdr_off set to zero. When there's more than one sge, dma_map_xdr() sets up the SEND sge's so they all point to the xdr_buf's head. The current Linux NFS/RDMA client always provides a reply chunk or a write list when performing an NFS READ over RDMA. Therefore, it does not exercise this particular case. The Linux server has never had to use more than one extra sge for building RPC/RDMA replies with a Linux client. However, an NFS/RDMA client _is_ allowed to send small NFS READs without setting up a write list or reply chunk. The NFS READ reply fits entirely within the inline reply buffer in this case. This is perhaps a more efficient way of performing NFS READs that the Linux NFS/RDMA client may some day adopt. Fixes: b432e6b3 ('svcrdma: Change DMA mapping logic to . . .') BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=285Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> [lizf: Backported to 3.4: adjust context] Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Thomas Huth authored
commit 1c2cb594 upstream. The EPOW interrupt handler uses rtas_get_sensor(), which in turn uses rtas_busy_delay() to wait for RTAS becoming ready in case it is necessary. But rtas_busy_delay() is annotated with might_sleep() and thus may not be used by interrupts handlers like the EPOW handler! This leads to the following BUG when CONFIG_DEBUG_ATOMIC_SLEEP is enabled: BUG: sleeping function called from invalid context at arch/powerpc/kernel/rtas.c:496 in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: swapper/1 CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.2.0-rc2-thuth #6 Call Trace: [c00000007ffe7b90] [c000000000807670] dump_stack+0xa0/0xdc (unreliable) [c00000007ffe7bc0] [c0000000000e1f14] ___might_sleep+0x134/0x180 [c00000007ffe7c20] [c00000000002aec0] rtas_busy_delay+0x30/0xd0 [c00000007ffe7c50] [c00000000002bde4] rtas_get_sensor+0x74/0xe0 [c00000007ffe7ce0] [c000000000083264] ras_epow_interrupt+0x44/0x450 [c00000007ffe7d90] [c000000000120260] handle_irq_event_percpu+0xa0/0x300 [c00000007ffe7e70] [c000000000120524] handle_irq_event+0x64/0xc0 [c00000007ffe7eb0] [c000000000124dbc] handle_fasteoi_irq+0xec/0x260 [c00000007ffe7ef0] [c00000000011f4f0] generic_handle_irq+0x50/0x80 [c00000007ffe7f20] [c000000000010f3c] __do_irq+0x8c/0x200 [c00000007ffe7f90] [c0000000000236cc] call_do_irq+0x14/0x24 [c00000007e6f39e0] [c000000000011144] do_IRQ+0x94/0x110 [c00000007e6f3a30] [c000000000002594] hardware_interrupt_common+0x114/0x180 Fix this issue by introducing a new rtas_get_sensor_fast() function that does not use rtas_busy_delay() - and thus can only be used for sensors that do not cause a BUSY condition - known as "fast" sensors. The EPOW sensor is defined to be "fast" in sPAPR - mpe. Fixes: 587f83e8 ("powerpc/pseries: Use rtas_get_sensor in RAS code") Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Mark Rustad authored
commit 7aa6ca4d upstream. Set the PCI_DEV_FLAGS_VPD_REF_F0 flag on all Intel Ethernet device functions other than function 0, so that on multi-function devices, we will always read VPD from function 0 instead of from the other functions. [bhelgaas: changelog] Signed-off-by: Mark Rustad <mark.d.rustad@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Alexander Duyck <alexander.h.duyck@redhat.com> [bwh: Backported to 3.2: - Put the class check in the new function as there is no DECLARE_PCI_FIXUP_CLASS_EARLY( - Adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Mark Rustad authored
commit 932c435c upstream. Add a dev_flags bit, PCI_DEV_FLAGS_VPD_REF_F0, to access VPD through function 0 to provide VPD access on other functions. This is for hardware devices that provide copies of the same VPD capability registers in multiple functions. Because the kernel expects that each function has its own registers, both the locking and the state tracking are affected by VPD accesses to different functions. On such devices for example, if a VPD write is performed on function 0, *any* later attempt to read VPD from any other function of that device will hang. This has to do with how the kernel tracks the expected value of the F bit per function. Concurrent accesses to different functions of the same device can not only hang but also corrupt both read and write VPD data. When hangs occur, typically the error message: vpd r/w failed. This is likely a firmware bug on this device. will be seen. Never set this bit on function 0 or there will be an infinite recursion. Signed-off-by: Mark Rustad <mark.d.rustad@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Bob Copeland authored
commit 3633ebeb upstream. We already set a station to be associated when peering completes, both in user space and in the kernel. Thus we should always have an associated sta before sending data frames to that station. Failure to check assoc state can cause crashes in the lower-level driver due to transmitting unicast data frames before driver sta structures (e.g. ampdu state in ath9k) are initialized. This occurred when forwarding in the presence of fixed mesh paths: frames were transmitted to stations with whom we hadn't yet completed peering. Reported-by: Alexis Green <agreen@cococorp.com> Tested-by: Jesse Jones <jjones@cococorp.com> Signed-off-by: Bob Copeland <me@bobcopeland.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Bjorn Helgaas authored
commit d1541dc9 upstream. In fixup_ti816x_class(), we assigned "class = PCI_CLASS_MULTIMEDIA_VIDEO". But PCI_CLASS_MULTIMEDIA_VIDEO is only the two-byte base class/sub-class and needs to be shifted to make space for the low-order interface byte. Shift PCI_CLASS_MULTIMEDIA_VIDEO to set the correct class code. Fixes: 63c44080 ("PCI: Add quirk for setting valid class for TI816X Endpoint") Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> CC: Hemant Pedanekar <hemantp@ti.com> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
David Härdeman authored
commit a66b0c41 upstream. The input_dev is already gone when the rc device is being unregistered so checking for its presence only means that no remove uevent will be generated. Signed-off-by: David Härdeman <david@hardeman.nu> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
- 21 Mar, 2016 11 commits
-
-
Zefan Li authored
-
Marcelo Tosatti authored
commit 7cae2bed upstream. As reported at https://bugs.launchpad.net/qemu/+bug/1494350, it is possible to have vcpu->arch.st.last_steal initialized from a thread other than vcpu thread, say the iothread, via KVM_SET_MSRS. Which can cause an overflow later (when subtracting from vcpu threads sched_info.run_delay). To avoid that, move steal time accumulation to vcpu entry time, before copying steal time data to guest. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Reviewed-by: David Matlack <dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Joe Thornber authored
commit 2871c69e upstream. Commit 4c7e3093 ("dm btree remove: fix bug in redistribute3") wasn't a complete fix for redistribute3(). The redistribute3 function takes 3 btree nodes and shares out the entries evenly between them. If the three nodes in total contained (MAX_ENTRIES * 3) - 1 entries between them then this was erroneously getting rebalanced as (MAX_ENTRIES - 1) on the left and right, and (MAX_ENTRIES + 1) in the center. Fix this issue by being more careful about calculating the target number of entries for the left and right nodes. Unit tested in userspace using this program: https://github.com/jthornber/redistribute3-test/blob/master/redistribute3_t.cSigned-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
John Youn authored
commit c4509601 upstream. The assignement of EP transfer resources was not handled properly in the dwc3 driver. Commit aebda618 ("usb: dwc3: Reset the transfer resource index on SET_INTERFACE") previously fixed one aspect of this where resources may be exhausted with multiple calls to SET_INTERFACE. However, it introduced an issue where composite devices with multiple interfaces can be assigned the same transfer resources for different endpoints. This patch solves both issues. The assignment of transfer resources cannot perfectly follow the data book due to the fact that the controller driver does not have all knowledge of the configuration in advance. It is given this information piecemeal by the composite gadget framework after every SET_CONFIGURATION and SET_INTERFACE. Trying to follow the databook programming model in this scenario can cause errors. For two reasons: 1) The databook says to do DEPSTARTCFG for every SET_CONFIGURATION and SET_INTERFACE (8.1.5). This is incorrect in the scenario of multiple interfaces. 2) The databook does not mention doing more DEPXFERCFG for new endpoint on alt setting (8.1.6). The following simplified method is used instead: All hardware endpoints can be assigned a transfer resource and this setting will stay persistent until either a core reset or hibernation. So whenever we do a DEPSTARTCFG(0) we can go ahead and do DEPXFERCFG for every hardware endpoint as well. We are guaranteed that there are as many transfer resources as endpoints. This patch triggers off of the calling dwc3_gadget_start_config() for EP0-out, which always happens first, and which should only happen in one of the above conditions. Fixes: aebda618 ("usb: dwc3: Reset the transfer resource index on SET_INTERFACE") Reported-by: Ravi Babu <ravibabu@ti.com> Signed-off-by: John Youn <johnyoun@synopsys.com> Signed-off-by: Felipe Balbi <balbi@kernel.org> [lizf: Backported to 3.4: adjust context] Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Anssi Hannula authored
commit 42e3121d upstream. AudioQuest DragonFly DAC reports a volume control range of 0..50 (0x0000..0x0032) which in USB Audio means a range of 0 .. 0.2dB, which is obviously incorrect and would cause software using the dB information in e.g. volume sliders to have a massive volume difference in 100..102% range. Commit 2d1cb7f6 ("ALSA: usb-audio: add dB range mapping for some devices") added a dB range mapping for it with range 0..50 dB. However, the actual volume mapping seems to be neither linear volume nor linear dB scale, but instead quite close to the cubic mapping e.g. alsamixer uses, with a range of approx. -53...0 dB. Replace the previous quirk with a custom dB mapping based on some basic output measurements, using a 10-item range TLV (which will still fit in alsa-lib MAX_TLV_RANGE_SIZE). Tested on AudioQuest DragonFly HW v1.2. The quirk is only applied if the range is 0..50, so if this gets fixed/changed in later HW revisions it will no longer be applied. v2: incorporated Takashi Iwai's suggestion for the quirk application method Signed-off-by: Anssi Hannula <anssi.hannula@iki.fi> Signed-off-by: Takashi Iwai <tiwai@suse.de> [lizf: Backoported to 3.4: use dev_info() instead of usb_audio_info()] Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Clemens Ladisch authored
commit bf1d1c9b upstream. Add a DECLARE_TLV_DB_RANGE() macro so that dB range information can be specified without having to count the items manually for TLV_DB_RANGE_HEAD(). Signed-off-by: Clemens Ladisch <clemens@ladisch.de> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Clemens Ladisch authored
commit b5b9eb54 upstream. Add helper macros with a little bit of preprocessor magic to automatically compute the length of a TLV item. This lets us avoid having to compute this by hand, and will allow to use items that do not use a fixed length. Signed-off-by: Clemens Ladisch <clemens@ladisch.de> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Jan Beulich authored
commit 0d430e3f upstream. This was meant to print base address and entry count; make it do so again. Fixes: 37868fe1 "x86/ldt: Make modify_ldt synchronous" Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andy Lutomirski <luto@kernel.org> Link: http://lkml.kernel.org/r/56797D8402000078000C24F0@prv-mh.provo.novell.comSigned-off-by: Thomas Gleixner <tglx@linutronix.de> [lizf: Backported to 3.4: adjust context] Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Rainer Weikusat authored
commit a5527dda upstream. The unix_dgram_sendmsg routine use the following test if (unlikely(unix_peer(other) != sk && unix_recvq_full(other))) { to determine if sk and other are in an n:1 association (either established via connect or by using sendto to send messages to an unrelated socket identified by address). This isn't correct as the specified address could have been bound to the sending socket itself or because this socket could have been connected to itself by the time of the unix_peer_get but disconnected before the unix_state_lock(other). In both cases, the if-block would be entered despite other == sk which might either block the sender unintentionally or lead to trying to unlock the same spin lock twice for a non-blocking send. Add a other != sk check to guard against this. Fixes: 7d267278 ("unix: avoid use-after-free in ep_remove_wait_queue") Reported-By: Philipp Hahn <pmhahn@pmhahn.de> Signed-off-by: Rainer Weikusat <rweikusat@mobileactivedefense.com> Tested-by: Philipp Hahn <pmhahn@pmhahn.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Hannes Frederic Sowa authored
commit 7bbadd2d upstream. Docbook does not like the definition of macros inside a field declaration and adds a warning. Move the definition out. Fixes: 79462ad0 ("net: add validation for the socket syscall protocol argument") Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> [lizf: Backported to 3.4: adjust context] Signed-off-by: Zefan Li <lizefan@huawei.com>
-
Ben Zhang authored
commit 62572e29 upstream. I ran into a scenario where while one cpu was stuck and should have panic'd because of the NMI watchdog, it didn't. The reason was another cpu was spewing stack dumps on to the console. Upon investigation, I noticed that when writing to the console and also when dumping the stack, the watchdog is touched. This causes all the cpus to reset their NMI watchdog flags and the 'stuck' cpu just spins forever. This change causes the semantics of touch_nmi_watchdog to be changed slightly. Previously, I accidentally changed the semantics and we noticed there was a codepath in which touch_nmi_watchdog could be touched from a preemtible area. That caused a BUG() to happen when CONFIG_DEBUG_PREEMPT was enabled. I believe it was the acpi code. My attempt here re-introduces the change to have the touch_nmi_watchdog() code only touch the local cpu instead of all of the cpus. But instead of using __get_cpu_var(), I use the __raw_get_cpu_var() version. This avoids the preemption problem. However my reasoning wasn't because I was trying to be lazy. Instead I rationalized it as, well if preemption is enabled then interrupts should be enabled to and the NMI watchdog will have no reason to trigger. So it won't matter if the wrong cpu is touched because the percpu interrupt counters the NMI watchdog uses should still be incrementing. Don said: : I'm ok with this patch, though it does alter the behaviour of how : touch_nmi_watchdog works. For the most part I don't think most callers : need to touch all of the watchdogs (on each cpu). Perhaps a corner case : will pop up (the scheduler?? to mimic touch_all_softlockup_watchdogs() ). : : But this does address an issue where if a system is locked up and one cpu : is spewing out useful debug messages (or error messages), the hard lockup : will fail to go off. We have seen this on RHEL also. Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Ben Zhang <benzh@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [lizf: Backported to 3.4: adjust context] Signed-off-by: Zefan Li <lizefan@huawei.com>
-