- 28 Aug, 2017 36 commits
-
-
Max Gurtovoy authored
This patch slightly improves performance (mainly for small block sizes). Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Martin Wilck authored
This changes the earlier patch "nvmet: don't report 0-bytes in serial number" to use the memcpy_and_pad() helper introduced in a previous patch. Signed-off-by: Martin Wilck <mwilck@suse.com> Reviewed-by: Sagi Grimberg <sagi@grimbeg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Martin Wilck authored
This helper function is useful for the nvme subsystem, and maybe others. Note: the warnings reported by the kbuild test robot for this patch are actually generated by the use of CONFIG_PROFILE_ALL_BRANCHES together with __FORTIFY_INLINE. Signed-off-by: Martin Wilck <mwilck@suse.com> Reviewed-by: Sagi Grimberg <sagi@grimbeg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
James Smart authored
The existing nvmet_fc sg list handling has 2 faults: a) the request between LLDD and transport has too large of an sg list (256 elements), which is normally 256k (64 elements). b) sglist handling doesn't optimize on the fact that each element is a page. This patch removes the static sg list in the request and uses the dynamic list already present in the nvmet_fc transport. It also simplies the handling of the sg list on multiple sequences to take advantage of the per-page divisions. Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
James Smart authored
If the LLDD resets or detaches from an fc port, the LLDD will deregister all remoteports seen by the fc port and deregister the localport associated with the fc port. The teardown of the localport structure will be held off due to reference counting until all the remoteports are removed (and they are held off until all controllers/associations to terminated). Currently, if the fc port is reinit/reattached and registered again as a localport it is treated as an independent entity from the prior localport and all prior remoteports and controllers cannot be revived. They are created as new and separate entities. This patch changes the localport registration to look at the known localports that are waiting to be torndown. If they are the same port based on wwn's, the local port is transitioned out of the teardown state. This allows the remote ports and controller connections to be reestablished and resumed as long as the localport can also be reregistered within the timeout windows. The patch adds a new routine nvme_fc_attach_to_unreg_lport() with the functionality and moves the lport get/put routines to avoid forward references. Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Max Gurtovoy authored
Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Max Gurtovoy authored
Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Sagi Grimberg authored
Use ctrl->device and lose the func name. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Guan Junxiong authored
This helps users to quickly spot the reason of why connection fails if the hostid is not compliant with the uuid format. Signed-off-by: Guan Junxiong <guanjunxiong@huawei.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Sagi Grimberg authored
To make the nvme_rdma_configure_admin_queue generic in preparation of moving it to common code. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Sagi Grimberg authored
No need to queue an extra work to indirect controller removal, just call the ctrl remove routine. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Sagi Grimberg authored
This should pair with nvme_rdma_stop_queue. While this is not a complete inverse, it still pairs up pretty well because in fabrics we don't have a disconnect capsule (yet) but we simply teardown the transport association. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Sagi Grimberg authored
Give it a name symmetric to nvme_rdma_free_queue. Also pass in the ctrl sqsize+1 and not the opts queue_size. And suppress a superflous failure message. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Sagi Grimberg authored
If we move the queues from LIVE state, we might as well stop them (drain for rdma). Do it after we stop the request queues to prevent a stray request sneaking in .queue_rq after we stop the queue. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Sagi Grimberg authored
Make a symmetrical handling with admin queue. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Sagi Grimberg authored
No need to open-code it. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Sagi Grimberg authored
We're not supposed to do that. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Sagi Grimberg authored
Mimic the pci driver as a controller disable might be more lightweight than a shutdown. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Sagi Grimberg authored
We always pair tagset allocation with rdma device reference and it shares some code, centralize it with an argument if its an admin or IO tagset. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Sagi Grimberg authored
Will be used when we centralize control flows. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Sagi Grimberg authored
We will call it from other places so avoid having to forward declare it. Also move it next to nvme_rdma_destroy_admin_queue. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Johannes Thumshirn authored
NVME_RDMA_MAX_SEGMENT_SIZE is not used anywhere, zap it. Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Johannes Thumshirn authored
ALL_OPTS isn't used anywhere, remove it. Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Guan Junxiong authored
nvmf target shall return NVME_SC_CONNECT_INVALID_HOST instead of the gereal code INVALID_PARAM when the given host nqn is not allowed to connect. Refer to the 2.2.1 section of the NVMe over Fabrics Spec. Signed-off-by: Guan Junxiong <guanjunxiong@huawei.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Christoph Hellwig authored
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
-
Jon Derrick authored
NVME's Timestamp feature allows controllers to be aware of the epoch time in milliseconds. This patch adds the set features hook for various transports through the identify path, so that resets and resumes can update the controller as necessary. Signed-off-by: Jon Derrick <jonathan.derrick@intel.com> [hch: rebased on top of nvme-4.13 error handling changes, changed nvme_configure_timestamp to return the status] Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Arnav Dawn authored
Define the constant "0xffffffff" (used as nsid for all namespaces) as NVME_NSID_ALL. Signed-off-by: Arnav Dawn <a.dawn@samsung.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
-
Arnav Dawn authored
This patch adds support for handling Fw activation without reset On completion of FW-activation-starting AER, all queues are paused till CSTS.PP is cleared or timed out (exceeds max time for fw activtion MTFA). If device fails to clear CSTS.PP within MTFA, driver issues reset controller. Signed-off-by: Arnav Dawn <a.dawn@samsung.com> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
-
Jens Axboe authored
Linux 4.13-rc7 Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
David Jeffery authored
There is a race between changing I/O elevator and request_queue removal which can trigger the warning in kobject_add_internal. A program can use sysfs to request a change of elevator at the same time another task is unregistering the request_queue the elevator would be attached to. The elevator's kobject will then attempt to be connected to the request_queue in the object tree when the request_queue has just been removed from sysfs. This triggers the warning in kobject_add_internal as the request_queue no longer has a sysfs directory: kobject_add_internal failed for iosched (error: -2 parent: queue) ------------[ cut here ]------------ WARNING: CPU: 3 PID: 14075 at lib/kobject.c:244 kobject_add_internal+0x103/0x2d0 To fix this warning, we can check the QUEUE_FLAG_REGISTERED flag when changing the elevator and use the request_queue's sysfs_lock to serialize between clearing the flag and the elevator testing the flag. Signed-off-by: David Jeffery <djeffery@redhat.com> Tested-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
weiping zhang authored
The last parameter "count" never be used in xxx_var_store, convert these functions to void. Signed-off-by: weiping zhang <zhangweiping@didichuxing.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Linus Torvalds authored
-
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommuLinus Torvalds authored
Pull IOMMU fix from Joerg Roedel: "Another fix, this time in common IOMMU sysfs code. In the conversion from the old iommu sysfs-code to the iommu_device_register interface, I missed to update the release path for the struct device associated with an IOMMU. It freed the 'struct device', which was a pointer before, but is now embedded in another struct. Freeing from the middle of allocated memory had all kinds of nasty side effects when an IOMMU was unplugged. Unfortunatly nobody unplugged and IOMMU until now, so this was not discovered earlier. The fix is to make the 'struct device' a pointer again" * tag 'iommu-fixes-v4.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu: Fix wrong freeing of iommu_device->dev
-
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-miscLinus Torvalds authored
Pull char/misc fix from Greg KH: "Here is a single misc driver fix for 4.13-rc7. It resolves a reported problem in the Android binder driver due to previous patches in 4.13-rc. It's been in linux-next with no reported issues" * tag 'char-misc-4.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: ANDROID: binder: fix proc->tsk check.
-
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/stagingLinus Torvalds authored
Pull staging/iio fixes from Greg KH: "Here are few small staging driver fixes, and some more IIO driver fixes for 4.13-rc7. Nothing major, just resolutions for some reported problems. All of these have been in linux-next with no reported problems" * tag 'staging-4.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: iio: magnetometer: st_magn: remove ihl property for LSM303AGR iio: magnetometer: st_magn: fix status register address for LSM303AGR iio: hid-sensor-trigger: Fix the race with user space powering up sensors iio: trigger: stm32-timer: fix get trigger mode iio: imu: adis16480: Fix acceleration scale factor for adis16480 PATCH] iio: Fix some documentation warnings staging: rtl8188eu: add RNX-N150NUB support Revert "staging: fsl-mc: be consistent when checking strcmp() return" iio: adc: stm32: fix common clock rate iio: adc: ina219: Avoid underflow for sleeping time iio: trigger: stm32-timer: add enable attribute iio: trigger: stm32-timer: fix get/set down count direction iio: trigger: stm32-timer: fix write_raw return value iio: trigger: stm32-timer: fix quadrature mode get routine iio: bmp280: properly initialize device for humidity reading
-
git://github.com/jonmason/ntbLinus Torvalds authored
Pull NTB fixes from Jon Mason: "NTB bug fixes to address an incorrect ntb_mw_count reference in the NTB transport, improperly bringing down the link if SPADs are corrupted, and an out-of-order issue regarding link negotiation and data passing" * tag 'ntb-4.13-bugfixes' of git://github.com/jonmason/ntb: ntb: ntb_test: ensure the link is up before trying to configure the mws ntb: transport shouldn't disable link due to bogus values in SPADs ntb: use correct mw_count function in ntb_tool and ntb_transport
-
- 27 Aug, 2017 3 commits
-
-
Linus Torvalds authored
The "lock_page_killable()" function waits for exclusive access to the page lock bit using the WQ_FLAG_EXCLUSIVE bit in the waitqueue entry set. That means that if it gets woken up, other waiters may have been skipped. That, in turn, means that if it sees the page being unlocked, it *must* take that lock and return success, even if a lethal signal is also pending. So instead of checking for lethal signals first, we need to check for them after we've checked the actual bit that we were waiting for. Even if that might then delay the killing of the process. This matches the order of the old "wait_on_bit_lock()" infrastructure that the page locking used to use (and is still used in a few other areas). Note that if we still return an error after having unsuccessfully tried to acquire the page lock, that is ok: that means that some other thread was able to get ahead of us and lock the page, and when that other thread then unlocks the page, the wakeup event will be repeated. So any other pending waiters will now get properly woken up. Fixes: 62906027 ("mm: add PageWaiters indicating tasks are waiting for a page bit") Cc: Nick Piggin <npiggin@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Jan Kara <jack@suse.cz> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Andi Kleen <ak@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Linus Torvalds authored
Tim Chen and Kan Liang have been battling a customer load that shows extremely long page wakeup lists. The cause seems to be constant NUMA migration of a hot page that is shared across a lot of threads, but the actual root cause for the exact behavior has not been found. Tim has a patch that batches the wait list traversal at wakeup time, so that we at least don't get long uninterruptible cases where we traverse and wake up thousands of processes and get nasty latency spikes. That is likely 4.14 material, but we're still discussing the page waitqueue specific parts of it. In the meantime, I've tried to look at making the page wait queues less expensive, and failing miserably. If you have thousands of threads waiting for the same page, it will be painful. We'll need to try to figure out the NUMA balancing issue some day, in addition to avoiding the excessive spinlock hold times. That said, having tried to rewrite the page wait queues, I can at least fix up some of the braindamage in the current situation. In particular: (a) we don't want to continue walking the page wait list if the bit we're waiting for already got set again (which seems to be one of the patterns of the bad load). That makes no progress and just causes pointless cache pollution chasing the pointers. (b) we don't want to put the non-locking waiters always on the front of the queue, and the locking waiters always on the back. Not only is that unfair, it means that we wake up thousands of reading threads that will just end up being blocked by the writer later anyway. Also add a comment about the layout of 'struct wait_page_key' - there is an external user of it in the cachefiles code that means that it has to match the layout of 'struct wait_bit_key' in the two first members. It so happens to match, because 'struct page *' and 'unsigned long *' end up having the same values simply because the page flags are the first member in struct page. Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Christopher Lameter <cl@linux.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Linus Torvalds authored
We have a MAX_LFS_FILESIZE macro that is meant to be filled in by filesystems (and other IO targets) that know they are 64-bit clean and don't have any 32-bit limits in their IO path. It turns out that our 32-bit value for that limit was bogus. On 32-bit, the VM layer is limited by the page cache to only 32-bit index values, but our logic for that was confusing and actually wrong. We used to define that value to (((loff_t)PAGE_SIZE << (BITS_PER_LONG-1))-1) which is actually odd in several ways: it limits the index to 31 bits, and then it limits files so that they can't have data in that last byte of a page that has the highest 31-bit index (ie page index 0x7fffffff). Neither of those limitations make sense. The index is actually the full 32 bit unsigned value, and we can use that whole full page. So the maximum size of the file would logically be "PAGE_SIZE << BITS_PER_LONG". However, we do wan tto avoid the maximum index, because we have code that iterates over the page indexes, and we don't want that code to overflow. So the maximum size of a file on a 32-bit host should actually be one page less than the full 32-bit index. So the actual limit is ULONG_MAX << PAGE_SHIFT. That means that we will not actually be using the page of that last index (ULONG_MAX), but we can grow a file up to that limit. The wrong value of MAX_LFS_FILESIZE actually caused problems for Doug Nazar, who was still using a 32-bit host, but with a 9.7TB 2 x RAID5 volume. It turns out that our old MAX_LFS_FILESIZE was 8TiB (well, one byte less), but the actual true VM limit is one page less than 16TiB. This was invisible until commit c2a9737f ("vfs,mm: fix a dead loop in truncate_inode_pages_range()"), which started applying that MAX_LFS_FILESIZE limit to block devices too. NOTE! On 64-bit, the page index isn't a limiter at all, and the limit is actually just the offset type itself (loff_t), which is signed. But for clarity, on 64-bit, just use the maximum signed value, and don't make people have to count the number of 'f' characters in the hex constant. So just use LLONG_MAX for the 64-bit case. That was what the value had been before too, just written out as a hex constant. Fixes: c2a9737f ("vfs,mm: fix a dead loop in truncate_inode_pages_range()") Reported-and-tested-by: Doug Nazar <nazard@nazar.ca> Cc: Andreas Dilger <adilger@dilger.ca> Cc: Mark Fasheh <mfasheh@versity.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Dave Kleikamp <shaggy@kernel.org> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 26 Aug, 2017 1 commit
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/inputLinus Torvalds authored
Pull input fixes from Dmitry Torokhov: - a tweak to the IBM Trackpoint driver that helps recognizing trackpoints on never Lenovo Carbons - a fix to the ALPS driver solving scroll issues on some Dells - yet another ACPI ID has been added to Elan I2C toucpad driver - quieted diagnostic message in soc_button_array driver * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: ALPS - fix two-finger scroll breakage in right side on ALPS touchpad Input: soc_button_array - silence -ENOENT error on Dell XPS13 9365 Input: trackpoint - add new trackpoint firmware ID Input: elan_i2c - add ELAN0602 ACPI ID to support Lenovo Yoga310
-