Commits · 2b2c1896871838cdf549442e8ad0264be5fa74e3 · Kirill Smelkov / linux

04 Nov, 2011 40 commits

NVMe: Don't probe namespace 0 · 2b2c1896

Matthew Wilcox authored Oct 07, 2011

ECN 001 documented that namespace 0 is not valid. Sending an Identify
with CNS of 0 and Namespace of 0 is an undefined command.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

Fix calculation of number of pages in a PRP List · 0d1bc912

Nisheeth Bhat authored Sep 29, 2011

The existing calculation underestimated the number of pages required
as it did not take into account the pointer at the end of each page.
The replacement calculation may overestimate the number of pages required
if the last page in the PRP List is entirely full. By using ->npages
as a counter as we fill in the pages, we ensure that we don't try to
free a page that was never allocated.
Signed-off-by: Nisheeth Bhat <nisheeth.bhat@intel.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
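
The arithmetic behind this fix can be sketched in userspace C. The page size, entry size, and function names below are assumptions for illustration and are not taken from the driver. Each page of a PRP List gives up its last 8-byte slot to a pointer to the next page, so dividing by the full page size can come up one page short, while dividing by the usable size can come out one page over when the final page is exactly full.

```c
#define DEMO_PAGE_SIZE 4096u /* assumed page size */
#define DEMO_PRP_SIZE  8u    /* each PRP entry is a 64-bit address */

/* Naive estimate: treats every slot in every page as usable. */
static unsigned demo_prp_pages_naive(unsigned nprps)
{
	return (nprps * DEMO_PRP_SIZE + DEMO_PAGE_SIZE - 1) / DEMO_PAGE_SIZE;
}

/* Conservative estimate: reserves one slot per page for the chain pointer. */
static unsigned demo_prp_pages_conservative(unsigned nprps)
{
	unsigned usable = DEMO_PAGE_SIZE - DEMO_PRP_SIZE;

	return (nprps * DEMO_PRP_SIZE + usable - 1) / usable;
}
```

With 1024 entries the naive form yields 2 pages although 3 are needed (511 + 511 + 2). With 512 entries the conservative form yields 2 pages although a single full page would do, which is the overestimate the message describes.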

NVMe: Create nvme_identify and nvme_get_features functions · bc5fc7e4

Matthew Wilcox authored Sep 19, 2011

Instead of open-coding calls to nvme_submit_admin_cmd, these
small wrappers are simpler to use (the patch removes 14 lines from
nvme_dev_add() for example).
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

NVMe: Fix memory leak in nvme_dev_add() · 684f5c20

Matthew Wilcox authored Sep 19, 2011

The driver was allocating 8k of memory, then freeing 4k of it.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

NVMe: Fix calls to dma_unmap_sg · d1a490e0

Nisheeth Bhat authored Sep 15, 2011

dma_unmap_sg() must be called with the same 'nents' passed to
dma_map_sg(), not the number returned from dma_map_sg().
Signed-off-by: Nisheeth Bhat <nisheeth.bhat@intel.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

NVMe: Correct sg list setup in nvme_map_user_pages · d0ba1e49

Matthew Wilcox authored Sep 13, 2011

Our SG list was constructed to always fill the entire first page, even
if that was more than the length of the I/O.  This is probably harmless,
but some IOMMUs might do something bad.

Correcting the first call to sg_set_page() made it look a lot closer to
the sg_set_page() in the loop, so fold the first call to sg_set_page()
into the loop.
Reported-by: Nisheeth Bhat <nisheeth.bhat@intel.com>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>

Fix bug in NVME_IOCTL_SUBMIT_IO · 6413214c

Matthew Wilcox authored Aug 09, 2011

Missing 'break' in the switch statement meant that we'd fall through
to the 'return -EINVAL' case.
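
A minimal illustration of this class of bug, using hypothetical command numbers and handler names rather than the driver's own:

```c
#include <errno.h>

enum demo_cmd { DEMO_CMD_SUBMIT_IO = 1 }; /* hypothetical ioctl number */

static int demo_submit_io(void)
{
	return 0; /* pretend the I/O succeeded */
}

/* Buggy shape: the handled case falls into the default branch. */
static int demo_ioctl_buggy(int cmd)
{
	int ret = 0;

	switch (cmd) {
	case DEMO_CMD_SUBMIT_IO:
		ret = demo_submit_io();
		/* fall through */
	default:
		ret = -EINVAL;
	}
	return ret;
}

/* Fixed shape: 'break' keeps the handler's own return value. */
static int demo_ioctl_fixed(int cmd)
{
	int ret = 0;

	switch (cmd) {
	case DEMO_CMD_SUBMIT_IO:
		ret = demo_submit_io();
		break;
	default:
		ret = -EINVAL;
	}
	return ret;
}
```

In the buggy version a successful submission is still reported to the caller as -EINVAL.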

NVMe: Rework ioctls · 6bbf1acd

Matthew Wilcox authored May 20, 2011

Remove the special-purpose IDENTIFY, GET_RANGE_TYPE, DOWNLOAD_FIRMWARE
and ACTIVATE_FIRMWARE commands.  Replace them with a generic ADMIN_CMD
ioctl that can submit any admin command.

Add a new ID ioctl that returns the namespace ID of the queried device.
It corresponds to the SCSI Idlun ioctl.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

NVMe: Add the nvme thread to the wait queue before waking it up · eac623ba

Matthew Wilcox authored May 20, 2011

If the I/O was not completed by a single NVMe command, we add the
bio to the congestion list and wake up the kthread to resubmit it.
But the kthread calls remove_wait_queue() unconditionally, which
will oops if it's not on the wait queue.  So add the kthread to
the wait queue before waking it up.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

NVMe: Return real error from nvme_create_queue · 6f0f5449

Matthew Wilcox authored May 11, 2011

nvme_setup_io_queues() was assuming that a NULL return from
nvme_create_queue() was an out-of-memory error. That's not necessarily
true; the adapter might return -EIO, for example. Change the calling
convention to return an ERR_PTR on failure instead of NULL.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
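
The calling convention can be approximated in userspace as follows. The `demo_*` helpers are stand-ins written for this sketch and only mimic the kernel's ERR_PTR(), IS_ERR() and PTR_ERR(); the queue structure and the simulated failure parameter are hypothetical.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_MAX_ERRNO 4095

/* Encode a negative errno in the top page of the address space. */
static void *demo_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static long demo_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

static int demo_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-DEMO_MAX_ERRNO;
}

struct demo_queue { int qid; };

/* Reports the real failure reason instead of collapsing it to NULL. */
static struct demo_queue *demo_create_queue(int qid, int simulated_errno)
{
	struct demo_queue *q;

	if (simulated_errno)
		return demo_err_ptr(-simulated_errno);
	q = malloc(sizeof(*q));
	if (!q)
		return demo_err_ptr(-ENOMEM);
	q->qid = qid;
	return q;
}

/* The caller propagates whatever error the callee reported. */
static int demo_setup_queue(int qid, int simulated_errno)
{
	struct demo_queue *q = demo_create_queue(qid, simulated_errno);

	if (demo_is_err(q))
		return (int)demo_ptr_err(q);
	free(q);
	return 0;
}
```

With a NULL-on-failure convention the caller could only guess at -ENOMEM; here a simulated -EIO reaches the caller unchanged.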

NVMe: Version 0.6 · be5e0948

Matthew Wilcox authored May 11, 2011

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

NVMe: Add a few calling convention notes · 184d2944

Matthew Wilcox authored May 11, 2011

For the benefit of reviewers, add comments to a few functions describing
their calling context.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

NVMe: Handle failures from memory allocations in nvme_setup_prps · b77954cb

Matthew Wilcox authored May 12, 2011

If any of the memory allocations in nvme_setup_prps fail, handle it by
modifying the passed-in data length to reflect the number of bytes we are
actually able to send. Also allow the caller to specify the GFP flags
they need; for user-initiated commands, we can use GFP_KERNEL allocations.

The various callers are updated to handle this possibility; the main
I/O path is already prepared for this possibility (as it may happen
due to nvme_map_bio being unable to map all the segments of the I/O).
The other callers return -ENOMEM instead of doing partial I/Os.
Reported-by: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

NVMe: Use an IDA to allocate minor numbers · 5aff9382

Matthew Wilcox authored May 06, 2011

The current approach of using the namespace ID as the minor number
doesn't work when there are multiple adapters in the machine. Rather
than statically partitioning the number of namespaces between adapters,
dynamically allocate minor numbers to namespaces as they are detected.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
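
The idea of handing out the lowest free number on demand can be sketched with a small bitmap allocator. This is not the kernel's IDA (declared in linux/idr.h); the names and the 64-ID limit are assumptions made to keep the example short.

```c
#include <stdint.h>

#define DEMO_MAX_IDS 64

static uint64_t demo_id_bitmap; /* bit n set means ID n is in use */

/* Return the lowest free ID, or -1 if all are taken. */
static int demo_id_alloc(void)
{
	for (int id = 0; id < DEMO_MAX_IDS; id++) {
		uint64_t bit = UINT64_C(1) << id;

		if (!(demo_id_bitmap & bit)) {
			demo_id_bitmap |= bit;
			return id;
		}
	}
	return -1;
}

/* Release an ID so that a later allocation can reuse it. */
static void demo_id_free(int id)
{
	if (id >= 0 && id < DEMO_MAX_IDS)
		demo_id_bitmap &= ~(UINT64_C(1) << id);
}
```

Because numbers are assigned as namespaces are detected, two adapters that each expose namespace 1 receive different minors, and a freed minor becomes available again.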

NVMe: Add include of delay.h for msleep · fd63e9ce

Matthew Wilcox authored May 06, 2011

Previously it was being implicitly included through some other header file.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

NVMe: Add support for timing out I/Os · 8de05535

Matthew Wilcox authored May 12, 2011

In the kthread, walk the list of outstanding I/Os and check they've not
hit the timeout.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

NVMe: Rename cancel_cmdid_data to cancel_cmdid · 21075bde

Matthew Wilcox authored Apr 28, 2011

The trailing '_data' on the end was annoying and inconsistent. Also, make
it actually return the data since this is needed for timing out commands.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

NVMe: Fix bug in error handling · 09a58f53

Matthew Wilcox authored Apr 28, 2011

When an I/O completed with an error, we would call bio_endio twice
(once with -EIO and once with 0).  Found by inspection.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

NVMe: Time out initialisation after a few seconds · 22605f96

Matthew Wilcox authored Apr 19, 2011

The device reports (in its capability register) how long it will take
to initialise. If that time elapses before the ready bit becomes set,
conclude the device is broken and refuse to initialise it. Log a nice
error message so the user knows why we did nothing.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
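
A simulated version of such a bounded wait is sketched below. It assumes the timeout sits in bits 31:24 of the capability register in units of 500 ms; the device structure, the 100 ms polling step, and the -ETIMEDOUT return value are choices made for this sketch, not details taken from the driver.

```c
#include <errno.h>
#include <stdint.h>

/* Assumed layout: timeout field in bits 31:24, in 500 ms units. */
#define DEMO_CAP_TIMEOUT(cap) (((cap) >> 24) & 0xff)

struct demo_ctrl {
	uint64_t cap;            /* simulated capability register */
	unsigned ready_after_ms; /* simulated time until the ready bit sets */
};

/* Poll the (simulated) ready bit until the advertised deadline passes. */
static int demo_wait_ready(const struct demo_ctrl *ctrl)
{
	unsigned deadline_ms = (unsigned)DEMO_CAP_TIMEOUT(ctrl->cap) * 500u;

	for (unsigned now_ms = 0; now_ms <= deadline_ms; now_ms += 100) {
		/* Stands in for reading the ready bit from the device. */
		if (now_ms >= ctrl->ready_after_ms)
			return 0;
	}
	return -ETIMEDOUT; /* give up; the caller would log an error */
}
```

A controller that advertises 2 seconds and becomes ready after 1.5 seconds is accepted, while one that needs 5 seconds is rejected.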

NVMe: Fix warning in free_irq · aba2080f

Matthew Wilcox authored Mar 27, 2011

We need to clear the affinity mask before calling free_irq().
Reported-by: Shane Michael Matthews <shane.matthews@intel.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

NVMe: Correct the Controller Configuration settings · 7f53f9d2

Matthew Wilcox authored Mar 22, 2011

The arbitration field was extended by one bit, shifting the shutdown
notification bits by one.  Also, the SQ/CQ entry size was made
configurable for future extensions.
Reported-by: Paul Luse <paul.e.luse@intel.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

NVMe: Version 0.5 · 8ef70067

Matthew Wilcox authored Mar 21, 2011

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

NVMe: Change the definition of nvme_user_io · 6c7d4945

Matthew Wilcox authored Mar 21, 2011

The read and write commands don't define a 'result', so there's no need
to copy it back to userspace.

Remove the ability of the ioctl to submit commands to a different
namespace; it's just asking for trouble, and the use case I have in mind
will be addressed through a different ioctl in the future.  That removes
the need for both the block_shift and nsid arguments.

Check that the opcode is one of 'read' or 'write'.  Other opcodes may
be added in the future, but we will need a different structure definition
for them.

The nblocks field is redefined to be 0-based.  This allows the user to
request the full 65536 blocks.

Don't byteswap the reftag, apptag and appmask.  Martin Petersen tells
me these are calculated in big-endian and are transmitted to the device
in big-endian.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
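
The 0-based encoding can be illustrated with a pair of hypothetical helpers. A 16-bit field holds 0 to 65535, so storing "count minus one" lets it express 1 to 65536 blocks.

```c
#include <stdint.h>

/* Encode a block count of 1..65536 into a 0-based 16-bit field. */
static int demo_encode_nblocks(uint32_t count, uint16_t *field)
{
	if (count < 1 || count > 65536)
		return -1;
	*field = (uint16_t)(count - 1);
	return 0;
}

/* Decode the 0-based field back into an actual block count. */
static uint32_t demo_decode_nblocks(uint16_t field)
{
	return (uint32_t)field + 1;
}
```

A 1-based field of the same width could not represent 65536, and its value 0 would be wasted.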

NVMe: Correct the definitions of two ioctls · 9d4af1b7

Matthew Wilcox authored Mar 20, 2011

NVME_IOCTL_SUBMIT_IO has a struct nvme_user_io, not a struct nvme_rw_command
as a parameter, and NVME_IOCTL_DOWNLOAD_FW is a Write, not a Read.
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

NVMe: Add compat_ioctl · 49481682

Matthew Wilcox authored Mar 19, 2011

Make ioctls work for 32-bit applications on 64-bit kernels.  The structures
are defined to be the same for both 32- and 64-bit applications, so
we can use the same handler for both.
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
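
One common way to get a structure that is identical for 32-bit and 64-bit callers is to use only fixed-width fields, carry user pointers as 64-bit integers, and order the fields so that no padding is needed. The structure below is a hypothetical example of that approach and is not the driver's actual definition.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ioctl argument: fixed-width fields, naturally aligned. */
struct demo_user_io {
	uint8_t  opcode;
	uint8_t  flags;
	uint16_t control;
	uint16_t nblocks;
	uint16_t rsvd;
	uint64_t metadata; /* user pointer carried as a 64-bit integer */
	uint64_t addr;     /* user pointer carried as a 64-bit integer */
	uint64_t slba;
};

/* Compile-time checks that the layout contains no hidden padding. */
_Static_assert(sizeof(struct demo_user_io) == 32, "unexpected size");
_Static_assert(offsetof(struct demo_user_io, metadata) == 8, "bad offset");
_Static_assert(offsetof(struct demo_user_io, addr) == 16, "bad offset");
_Static_assert(offsetof(struct demo_user_io, slba) == 24, "bad offset");
```

When the layout holds on both ABIs, no translation layer is needed and the regular handler can serve as the compat handler as well.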

NVMe: Simplify queue lookup · 9ecdc946

Matthew Wilcox authored Mar 16, 2011

Fill in all the num_possible_cpus() entries with duplicate pointers.
This reduces the complexity of the frequently-called get_nvmeq(), as
well as avoiding a bug in it when there are fewer queues than CPUs.
Reported-by: Shane Michael Matthews <shane.matthews@intel.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
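
A sketch of the table-filling approach, with a fixed CPU count and a modulo assignment that are assumptions of this example (the driver may distribute queues differently):

```c
#define DEMO_NR_CPUS 8 /* stands in for num_possible_cpus() */

struct demo_ioq { int qid; };

static struct demo_ioq *demo_cpu_queue[DEMO_NR_CPUS];

/* Give every CPU slot a valid queue, reusing queues when there are few. */
static void demo_fill_queue_table(struct demo_ioq *queues, int nr_queues)
{
	for (int cpu = 0; cpu < DEMO_NR_CPUS; cpu++)
		demo_cpu_queue[cpu] = &queues[cpu % nr_queues];
}

/* The hot path becomes a plain array lookup with no special cases. */
static struct demo_ioq *demo_get_queue(int cpu)
{
	return demo_cpu_queue[cpu];
}
```

The cost of handling "fewer queues than CPUs" is paid once at setup time, so the lookup itself has no branch that could get that case wrong.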

NVMe: Remove the kthread from the wait queue · 3cb967c0

Matthew Wilcox authored Mar 16, 2011

Once there are no more bios on the congestion list, we can stop waking
up the nvme kthread every time a completion happens.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

NVMe: Fix off-by-one when filling in PRP lists · 7523d834

Matthew Wilcox authored Mar 16, 2011

If the last element in the PRP list fits on the end of the page, there's
no need to allocate an extra page to put that single element in.  It can
fit on the end of the page.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

NVMe: Fix interpretation of 'Number of Namespaces' field · ac88c36a

Matthew Wilcox authored Mar 16, 2011

The spec says this is a 0's based value. We don't need to handle the
maximal value because it's reserved to mean "every namespace".
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

NVMe: Remove outdated comments · 19e899b2

Matthew Wilcox authored Mar 16, 2011

The head can never overrun the tail since we won't allocate enough command
IDs to let that happen. The status codes are in sync with the spec.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

NVMe: Fix comment formatting · fa922821

Matthew Wilcox authored Mar 16, 2011

Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

NVMe: Convert comments to kernel-doc notation · 714a7a22

Matthew Wilcox authored Mar 16, 2011

Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

NVMe: Update admin opcodes to match the 1.0RC spec · 2ddc4f74

Krzysztof Wierzbicki authored Feb 28, 2011

Signed-off-by: Krzysztof Wierzbicki <krzysztof.wierzbicki@intel.com>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

NVMe: Version 0.4 · b57ab0fa

Matthew Wilcox authored Feb 24, 2011

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

NVMe: Reduce maximum queue depth by 1 · e6d15f79

Matthew Wilcox authored Feb 24, 2011

The spec says we're not allowed to completely fill the submission queue.
Solve this by reducing the number of allocatable cmdids by 1.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
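
The reason a circular queue cannot be filled completely is that head == tail has to mean "empty". A small ring model, with hypothetical names, shows the resulting limit of depth - 1 outstanding entries:

```c
struct demo_ring {
	unsigned head;  /* next entry the consumer will take */
	unsigned tail;  /* next free slot for the producer */
	unsigned depth; /* number of slots in the ring */
};

/* Number of entries currently queued. */
static unsigned demo_ring_used(const struct demo_ring *r)
{
	return (r->tail + r->depth - r->head) % r->depth;
}

/* Refuse to use the last slot so that full and empty stay distinct. */
static int demo_ring_submit(struct demo_ring *r)
{
	if (demo_ring_used(r) == r->depth - 1)
		return -1;
	r->tail = (r->tail + 1) % r->depth;
	return 0;
}
```

Capping the number of allocatable command IDs at one less than the queue depth enforces the same limit without an explicit fullness check on every submission.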

NVMe: Fix discontiguous accesses · d8ee9d69

Matthew Wilcox authored Feb 24, 2011

When we submit subsequent portions of the I/O, we need to access the
updated block, not start reading again from the original position.
This was showing up as miscompares in the XFS randholes testcase.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

NVMe: Handle bios that contain non-virtually contiguous addresses · 1ad2f893

Matthew Wilcox authored Feb 23, 2011

NVMe scatterlists must be virtually contiguous, like almost all I/Os.
However, when the filesystem lays out files with a hole, it can be that
adjacent LBAs map to non-adjacent virtual addresses.  Handle this by
submitting one NVMe command at a time for each virtually discontiguous
range.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

NVMe: Implement Flush · 00df5cb4

Matthew Wilcox authored Feb 22, 2011

Linux implements Flush as a bit in the bio. That means there may also be
data associated with the flush; if so the flush should be sent before the
data. To avoid completing the bio twice, I add CMD_CTX_FLUSH to indicate
the completion routine should do nothing.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

NVMe: Mark CMD_CTX_CANCELLED as being unlikely · c4270559

Matthew Wilcox authored Feb 22, 2011

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

NVMe: Correct SQ doorbell semantics · 7547881d

Matthew Wilcox authored Feb 16, 2011

The value written to the doorbell needs to be the first free index in
the queue, not the most recently used index in the queue.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
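
In a simplified model of a submission queue, the corrected ordering looks like this. The structure, the depth of 4, and the plain variable standing in for the doorbell register are all assumptions of the sketch.

```c
#include <stdint.h>

#define DEMO_SQ_DEPTH 4

struct demo_sq {
	uint32_t entries[DEMO_SQ_DEPTH];
	uint16_t tail;     /* index of the first free slot */
	uint32_t doorbell; /* stands in for the device's doorbell register */
};

/* Copy the command in, advance the tail, then publish the new tail. */
static void demo_sq_submit(struct demo_sq *sq, uint32_t cmd)
{
	sq->entries[sq->tail] = cmd;
	if (++sq->tail == DEMO_SQ_DEPTH)
		sq->tail = 0;
	/* First free index, not the index of the slot just filled. */
	sq->doorbell = sq->tail;
}
```

Writing the index of the slot just filled would leave the device believing the newest command had not yet been submitted.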
