Commits · 09a58f536436efed02ead722e835cb4ce7674afc · Kirill Smelkov / linux

04 Nov, 2011 40 commits

NVMe: Fix bug in error handling · 09a58f53

Matthew Wilcox authored Apr 28, 2011

When an I/O completed with an error, we would call bio_endio twice
(once with -EIO and once with 0).  Found by inspection.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

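The single-completion rule described in this commit can be illustrated in userspace C. This is only a sketch: `fake_bio`, `fake_bio_endio()` and `complete_io()` are hypothetical stand-ins, not the driver's code or the kernel's `bio_endio()`.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a struct bio: records how it was completed. */
struct fake_bio {
    int completions;   /* number of times the bio was ended */
    int last_error;    /* error passed to the most recent completion */
};

/* Hypothetical stand-in for bio_endio(); a bio must be ended only once. */
static void fake_bio_endio(struct fake_bio *bio, int error)
{
    assert(bio->completions == 0);
    bio->completions++;
    bio->last_error = error;
}

/* Pick the result first, then complete exactly once. */
static void complete_io(struct fake_bio *bio, unsigned status)
{
    fake_bio_endio(bio, status ? -EIO : 0);
}
```

Choosing the error code before the single call makes it structurally impossible to fall through from the error path into the success path.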

NVMe: Time out initialisation after a few seconds · 22605f96

Matthew Wilcox authored Apr 19, 2011

The device reports (in its capability register) how long it will take
to initialise. If that time elapses before the ready bit becomes set,
conclude the device is broken and refuse to initialise it. Log a nice
error message so the user knows why we did nothing.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

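A minimal userspace sketch of the timeout logic, under stated assumptions: the timeout is taken from bits 31:24 of the capability register in 500 ms units (the NVMe 1.0 layout, not stated in the commit message), and `wait_ready()`, `ready_after_1s()` and `never_ready()` are hypothetical helpers that model polling rather than real register access.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: CAP bits 31:24 hold the timeout in units of 500 ms. */
static unsigned cap_timeout_ms(uint64_t cap)
{
    return (unsigned)((cap >> 24) & 0xff) * 500;
}

/*
 * Poll a hypothetical ready callback every tick_ms milliseconds of
 * simulated time. Returns true if the device became ready before the
 * advertised timeout elapsed, false if it should be treated as broken.
 */
static bool wait_ready(uint64_t cap, unsigned tick_ms,
                       bool (*ready)(unsigned elapsed_ms))
{
    unsigned limit = cap_timeout_ms(cap);

    assert(tick_ms > 0);
    for (unsigned elapsed = 0; elapsed <= limit; elapsed += tick_ms) {
        if (ready(elapsed))
            return true;
    }
    return false;
}

/* Device models used only for illustration. */
static bool ready_after_1s(unsigned elapsed_ms) { return elapsed_ms >= 1000; }
static bool never_ready(unsigned elapsed_ms) { (void)elapsed_ms; return false; }
```

A caller would refuse to initialise the device and log an error when `wait_ready()` returns false.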

NVMe: Fix warning in free_irq · aba2080f

Matthew Wilcox authored Mar 27, 2011

We need to clear the affinity mask before calling free_irq()
Reported-by: Shane Michael Matthews <shane.matthews@intel.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>


NVMe: Correct the Controller Configuration settings · 7f53f9d2

Matthew Wilcox authored Mar 22, 2011

The arbitration field was extended by one bit, shifting the shutdown
notification bits by one.  Also, the SQ/CQ entry size was made
configurable for future extensions.
Reported-by: Paul Luse <paul.e.luse@intel.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

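The field shifts described here can be sketched as follows. The bit positions are an assumption taken from the NVMe 1.0 register layout (3-bit arbitration field at bit 11, shutdown notification at bit 14, SQ/CQ entry sizes at bits 16 and 20); the macro names and `build_cc()` are hypothetical, not the driver's definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed Controller Configuration layout (NVMe 1.0). */
#define CC_ENABLE        (1u << 0)
#define CC_MPS_SHIFT     7           /* memory page size, log2(bytes) - 12 */
#define CC_AMS_RR        (0u << 11)  /* round-robin arbitration */
#define CC_SHN_NONE      (0u << 14)  /* shutdown notification, after the shift */
#define CC_SHN_NORMAL    (1u << 14)
#define CC_IOSQES_SHIFT  16          /* log2 of submission queue entry size */
#define CC_IOCQES_SHIFT  20          /* log2 of completion queue entry size */

/* Build a CC value that enables the controller with the given sizes. */
static uint32_t build_cc(unsigned page_shift, unsigned sqes_log2,
                         unsigned cqes_log2)
{
    assert(page_shift >= 12 && page_shift - 12 <= 0xf);
    assert(sqes_log2 <= 0xf && cqes_log2 <= 0xf);

    return CC_ENABLE
         | ((uint32_t)(page_shift - 12) << CC_MPS_SHIFT)
         | CC_AMS_RR
         | CC_SHN_NONE
         | ((uint32_t)sqes_log2 << CC_IOSQES_SHIFT)
         | ((uint32_t)cqes_log2 << CC_IOCQES_SHIFT);
}
```

For example, `build_cc(12, 6, 4)` describes 4 KiB pages, 64-byte submission entries and 16-byte completion entries.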

NVMe: Version 0.5 · 8ef70067

Matthew Wilcox authored Mar 21, 2011

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>


NVMe: Change the definition of nvme_user_io · 6c7d4945

Matthew Wilcox authored Mar 21, 2011

The read and write commands don't define a 'result', so there's no need
to copy it back to userspace.

Remove the ability of the ioctl to submit commands to a different
namespace; it's just asking for trouble, and the use case I have in mind
will be addressed through a different ioctl in the future.  That removes
the need for both the block_shift and nsid arguments.

Check that the opcode is one of 'read' or 'write'.  More opcodes may
be added in the future, but we will need a different structure definition
for them.

The nblocks field is redefined to be 0-based.  This allows the user to
request the full 65536 blocks.

Don't byteswap the reftag, apptag and appmask.  Martin Petersen tells
me these are calculated in big-endian and are transmitted to the device
in big-endian.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

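The 0-based encoding of nblocks can be shown with a short sketch. It assumes the field is 16 bits wide, which is what a maximum of 65536 blocks implies; `encode_nblocks()` and `decode_nblocks()` are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a block count in 1..65536 to the 0-based 16-bit field. */
static uint16_t encode_nblocks(uint32_t count)
{
    assert(count >= 1 && count <= 65536);
    return (uint16_t)(count - 1);
}

/* Convert the 0-based field back to a real block count. */
static uint32_t decode_nblocks(uint16_t field)
{
    return (uint32_t)field + 1;
}
```

With a 1-based field, the largest request would be 65535 blocks and the value 0 would be wasted; the 0-based form uses every encoding.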

NVMe: Correct the definitions of two ioctls · 9d4af1b7

Matthew Wilcox authored Mar 20, 2011

NVME_IOCTL_SUBMIT_IO has a struct nvme_user_io, not a struct nvme_rw_command
as a parameter, and NVME_IOCTL_DOWNLOAD_FW is a Write, not a Read.
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>


NVMe: Add compat_ioctl · 49481682

Matthew Wilcox authored Mar 19, 2011

Make ioctls work for 32-bit applications on 64-bit kernels.  The structures
are defined to be the same for both 32- and 64-bit applications, so
we can use the same handler for both.
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>


NVMe: Simplify queue lookup · 9ecdc946

Matthew Wilcox authored Mar 16, 2011

Fill in all the num_possible_cpus() entries with duplicate pointers.
This reduces the complexity of the frequently-called get_nvmeq(), as
well as avoiding a bug in it when there are fewer queues than CPUs.
Reported-by: Shane Michael Matthews <shane.matthews@intel.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

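A sketch of the lookup table idea, with hypothetical names (`fake_queue`, `fill_queue_table()`, `get_queue()`). The commit only says the entries are filled with duplicate pointers; distributing them by `cpu % nqueues` is an assumption made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct nvme_queue. */
struct fake_queue {
    int qid;
};

/*
 * Give every CPU slot a valid queue pointer, reusing queues when there
 * are fewer queues than CPUs (assumed policy: round-robin by CPU number).
 */
static void fill_queue_table(struct fake_queue **table, size_t ncpus,
                             struct fake_queue *queues, size_t nqueues)
{
    assert(nqueues > 0);
    for (size_t cpu = 0; cpu < ncpus; cpu++)
        table[cpu] = &queues[cpu % nqueues];
}

/* The hot path becomes a plain array index with no bounds logic. */
static struct fake_queue *get_queue(struct fake_queue **table, size_t cpu)
{
    return table[cpu];
}
```

The cost is paid once at setup time, so the frequently called lookup needs no conditional for the "fewer queues than CPUs" case.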

NVMe: Remove the kthread from the wait queue · 3cb967c0

Matthew Wilcox authored Mar 16, 2011

Once there are no more bios on the congestion list, we can stop waking
up the nvme kthread every time a completion happens.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>


NVMe: Fix off-by-one when filling in PRP lists · 7523d834

Matthew Wilcox authored Mar 16, 2011

If the last element in the PRP list fits on the end of the page, there's
no need to allocate an extra page to put that single element in.  It can
fit on the end of the page.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

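The boundary case can be worked through with a small calculation. This sketch assumes 8-byte PRP entries and that the last slot of every non-final list page is used as a pointer to the next page; `prp_list_pages()` is a hypothetical helper, not the driver's function.

```c
#include <assert.h>

#define PRP_ENTRY_SIZE 8u   /* assumed size of one PRP entry in bytes */

/*
 * Number of list pages needed for nprps entries. A non-final page holds
 * (per_page - 1) entries plus a chain pointer; the final page can use
 * all per_page slots, so a list that ends exactly at the end of a page
 * does not need another page.
 */
static unsigned prp_list_pages(unsigned nprps, unsigned page_size)
{
    unsigned per_page, chained;

    assert(page_size % PRP_ENTRY_SIZE == 0 && page_size >= 2 * PRP_ENTRY_SIZE);
    per_page = page_size / PRP_ENTRY_SIZE;
    chained = per_page - 1;

    if (nprps == 0)
        return 0;
    if (nprps <= per_page)
        return 1;
    /* 1 final page plus enough chained pages for the remainder. */
    return 1 + (nprps - per_page + chained - 1) / chained;
}
```

With 4096-byte pages, 512 entries fit in one page, while 513 entries need two.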

NVMe: Fix interpretation of 'Number of Namespaces' field · ac88c36a

Matthew Wilcox authored Mar 16, 2011

The spec says this is a 0s based value. We don't need to handle the
maximal value because it's reserved to mean "every namespace".
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>


NVMe: Remove outdated comments · 19e899b2

Matthew Wilcox authored Mar 16, 2011

The head can never overrun the tail since we won't allocate enough command
IDs to let that happen. The status codes are in sync with the spec.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>


NVMe: Fix comment formatting · fa922821

Matthew Wilcox authored Mar 16, 2011

Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>


NVMe: Convert comments to kernel-doc notation · 714a7a22

Matthew Wilcox authored Mar 16, 2011

Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>


NVMe: Update admin opcodes to match the 1.0RC spec · 2ddc4f74

Krzysztof Wierzbicki authored Feb 28, 2011

Signed-off-by: Krzysztof Wierzbicki <krzysztof.wierzbicki@intel.com>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>


NVMe: Version 0.4 · b57ab0fa

Matthew Wilcox authored Feb 24, 2011

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>


NVMe: Reduce maximum queue depth by 1 · e6d15f79

Matthew Wilcox authored Feb 24, 2011

The spec says we're not allowed to completely fill the submission queue.
Solve this by reducing the number of allocatable cmdids by 1.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

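Why a queue may not be completely filled can be seen in a generic ring-buffer sketch: if every slot were used, the tail would wrap onto the head and a full queue would look identical to an empty one. The commit enforces the limit by allocating one fewer command ID; the sketch below checks the ring indices directly instead, and `fake_ring` and its helpers are hypothetical.

```c
#include <assert.h>

/* Hypothetical submission ring described only by its indices. */
struct fake_ring {
    unsigned head;   /* next entry the device will consume */
    unsigned tail;   /* next free slot for the host */
    unsigned depth;  /* number of slots in the ring */
};

/* Entries currently queued between head and tail. */
static unsigned ring_in_use(const struct fake_ring *r)
{
    return (r->tail + r->depth - r->head) % r->depth;
}

/* Accept a submission only while at most depth - 1 slots are in use. */
static int ring_submit(struct fake_ring *r)
{
    assert(r->depth >= 2);
    if (ring_in_use(r) == r->depth - 1)
        return -1;                      /* would make head == tail */
    r->tail = (r->tail + 1) % r->depth;
    return 0;
}
```

A ring of depth 4 therefore accepts three outstanding commands, not four.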

NVMe: Fix discontiguous accesses · d8ee9d69

Matthew Wilcox authored Feb 24, 2011

When we submit subsequent portions of the I/O, we need to access the
updated block, not start reading again from the original position.
This was showing up as miscompares in the XFS randholes testcase.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>


NVMe: Handle bios that contain non-virtually contiguous addresses · 1ad2f893

Matthew Wilcox authored Feb 23, 2011

NVMe scatterlists must be virtually contiguous, like almost all I/Os.
However, when the filesystem lays out files with a hole, it can be that
adjacent LBAs map to non-adjacent virtual addresses.  Handle this by
submitting one NVMe command at a time for each virtually discontiguous
range.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

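One way to picture the splitting is to count how many commands a list of segments needs. The contiguity rule used here is an assumption for illustration: two adjacent segments can share a command only if the first ends on a page boundary and the second starts at page offset 0. `fake_vec`, `virt_mergeable()` and `count_commands()` are hypothetical names, and 4096-byte pages are assumed.

```c
#include <assert.h>
#include <stddef.h>

#define VEC_PAGE_SIZE 4096u   /* assumed page size */

/* Hypothetical segment: offset within its page, and length in bytes. */
struct fake_vec {
    unsigned offset;
    unsigned len;
};

/* Assumed rule: prev must end on a page boundary, next must start at 0. */
static int virt_mergeable(const struct fake_vec *prev,
                          const struct fake_vec *next)
{
    return (prev->offset + prev->len) % VEC_PAGE_SIZE == 0 &&
           next->offset == 0;
}

/* One command per virtually contiguous run of segments. */
static unsigned count_commands(const struct fake_vec *v, size_t n)
{
    unsigned cmds;

    assert(n == 0 || v != NULL);
    if (n == 0)
        return 0;
    cmds = 1;
    for (size_t i = 1; i < n; i++) {
        if (!virt_mergeable(&v[i - 1], &v[i]))
            cmds++;
    }
    return cmds;
}
```

Two full pages back to back need one command; a full page followed by a segment starting mid-page needs two.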

NVMe: Implement Flush · 00df5cb4

Matthew Wilcox authored Feb 22, 2011

Linux implements Flush as a bit in the bio. That means there may also be
data associated with the flush; if so the flush should be sent before the
data. To avoid completing the bio twice, I add CMD_CTX_FLUSH to indicate
the completion routine should do nothing.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>


NVMe: Mark CMD_CTX_CANCELLED as being unlikely · c4270559

Matthew Wilcox authored Feb 22, 2011

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>


NVMe: Correct SQ doorbell semantics · 7547881d

Matthew Wilcox authored Feb 16, 2011

The value written to the doorbell needs to be the first free index in
the queue, not the most recently used index in the queue.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

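The doorbell semantics can be sketched with a simulated queue. `fake_sq` and `sq_submit()` are hypothetical, the doorbell is a plain struct field rather than a device register, and fullness checks are left out to keep the example short.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical submission queue with a simulated doorbell register. */
struct fake_sq {
    uint16_t tail;       /* first free index */
    uint16_t depth;      /* number of slots in use by this example, <= 8 */
    uint32_t doorbell;   /* last value "written" to the doorbell */
    uint32_t slots[8];   /* stand-in for command storage */
};

/* Store a command, advance the tail, then ring the doorbell. */
static void sq_submit(struct fake_sq *sq, uint32_t cmd)
{
    assert(sq->depth >= 1 && sq->depth <= 8 && sq->tail < sq->depth);

    sq->slots[sq->tail] = cmd;       /* command goes in the current tail slot */
    if (++sq->tail == sq->depth)     /* advance with wrap-around */
        sq->tail = 0;
    sq->doorbell = sq->tail;         /* first free index, not the slot just used */
}
```

After one submission into an empty queue, the doorbell holds 1 even though the command was written to slot 0.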

NVMe: Let the kthread take care of devices earlier · 740216fc

Matthew Wilcox authored Feb 15, 2011

If interrupts are misconfigured, the kthread will be needed to process
admin queue completions.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>


NVMe: Rename nr_queues to nr_io_queues · b348b7d5

Matthew Wilcox authored Feb 15, 2011

I got confused about whether this included the admin queue or not, and
had to resort to reading the spec.  It doesn't include the admin queue,
so make that clear in the name.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>


NVMe: Remove setting of 'flags' in rw command · ca161542

Matthew Wilcox authored Feb 15, 2011

This was the data transfer bit until spec rev 0.92
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>


NVMe: Release 0.3 · ad8a5df9

Matthew Wilcox authored Feb 14, 2011

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>


NVMe: Add a kthread to handle the congestion list · 1fa6aead

Matthew Wilcox authored Mar 02, 2011

Instead of trying to resubmit I/Os in the I/O completion path (in
interrupt context), wake up a kthread which will resubmit I/O from
user context. This allows mke2fs to run to completion.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>


NVMe: Handle failures differently in nvme_submit_bio_queue() · eeee3226

Matthew Wilcox authored Feb 14, 2011

Return -EBUSY if the queue is full or -ENOMEM if we failed to allocate
memory (or map a scatterlist).  Also use GFP_ATOMIC to allocate the
nvme_bio and move the locking to the callers of nvme_submit_bio_queue().

In nvme_make_request(), don't permit an I/O to jump the queue -- if the
congestion list already has an entry, just add to the tail, rather than
trying to submit.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

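The no-queue-jumping rule can be modelled with counters. This is a sketch only: `fake_nvmeq`, `try_submit()` and `make_request()` are hypothetical, the congestion list is reduced to a count, and the -ENOMEM path is omitted.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical queue state: counters instead of real lists. */
struct fake_nvmeq {
    unsigned inflight;    /* commands currently submitted */
    unsigned depth;       /* maximum commands in flight */
    unsigned congested;   /* I/Os waiting on the congestion list */
};

/* Submit directly, or report -EBUSY when the queue is full. */
static int try_submit(struct fake_nvmeq *q)
{
    assert(q->depth > 0);
    if (q->inflight >= q->depth)
        return -EBUSY;
    q->inflight++;
    return 0;
}

/*
 * If anything is already waiting, queue behind it without trying to
 * submit; otherwise try to submit and fall back to the list on failure.
 */
static void make_request(struct fake_nvmeq *q)
{
    if (q->congested > 0 || try_submit(q) != 0)
        q->congested++;
}
```

Because the congestion check comes first, a new I/O is never submitted ahead of older I/Os that are still waiting, even if a slot has just become free.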

NVMe: Update BAR structure to match the current spec · 897cfe1c

Matthew Wilcox authored Feb 14, 2011

Add two reserved registers in the middle of the BAR to match the 1.0
spec plus ECN 0002.

Also rename IMC and ISC to INTMC and INTSC to conform with the spec.
We still don't need to use them :-)
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>


NVMe: Handle physical merging of bvec entries · 76830840

Matthew Wilcox authored Feb 10, 2011

In order to not overrun the sg array, we have to merge physically
contiguous pages into a single sg entry.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

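Merging physically contiguous entries can be sketched as a single pass over the segments. `fake_seg` and `merge_segments()` are hypothetical names; real scatterlist entries carry a page pointer and offset rather than a raw physical address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical segment described by physical address and length. */
struct fake_seg {
    uint64_t addr;
    uint32_t len;
};

/*
 * Copy segments from in[] to out[], extending the previous output entry
 * whenever the next input starts exactly where it ends. Returns the
 * number of output entries; out[] must have room for n entries.
 */
static size_t merge_segments(const struct fake_seg *in, size_t n,
                             struct fake_seg *out)
{
    size_t nout = 0;

    assert(n == 0 || (in != NULL && out != NULL));
    for (size_t i = 0; i < n; i++) {
        if (nout > 0 &&
            out[nout - 1].addr + out[nout - 1].len == in[i].addr)
            out[nout - 1].len += in[i].len;   /* contiguous: grow entry */
        else
            out[nout++] = in[i];              /* gap: start a new entry */
    }
    return nout;
}
```

The output never has more entries than the input, which is what keeps a fixed-size sg array from being overrun.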

NVMe: Check for DMA mapping failure · 1974b1ae

Matthew Wilcox authored Feb 10, 2011

If dma_map_sg returns 0 (failure), we need to fail the I/O.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>


NVMe: Pass the nvme_dev to nvme_free_prps and nvme_setup_prps · d567760c

Matthew Wilcox authored Feb 10, 2011

We were passing the nvme_queue to access the q_dmadev for the
dma_alloc_coherent calls, but since we moved to the dma pool API,
we really only need the nvme_dev.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>


NVMe: Optimise memory usage for I/Os between 4k and 128k · 99802a7a

Matthew Wilcox authored Feb 10, 2011

Add a second memory pool for smaller I/Os. We can pack 16 of these on a
single page instead of using an entire page for each one.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>

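The pool choice can be sketched as a size calculation. Assumptions made for this example: 4096-byte pages, 8-byte PRP entries, page-aligned I/O, the first two data pages addressed directly by the command, and a small-pool element of 256 bytes (a page divided by the 16 elements mentioned above). `choose_prp_pool()` and the enum are hypothetical names.

```c
#include <assert.h>

#define IO_PAGE_SIZE     4096u                  /* assumed page size */
#define PRP_ENTRY_BYTES  8u                     /* assumed PRP entry size */
#define SMALL_POOL_BYTES (IO_PAGE_SIZE / 16u)   /* 16 elements per page */

enum prp_pool { PRP_POOL_NONE, PRP_POOL_SMALL, PRP_POOL_PAGE };

/* Decide which pool, if any, a page-aligned I/O of io_bytes would use. */
static enum prp_pool choose_prp_pool(unsigned io_bytes)
{
    unsigned pages, list_entries;

    assert(io_bytes > 0);
    pages = (io_bytes + IO_PAGE_SIZE - 1) / IO_PAGE_SIZE;
    if (pages <= 2)
        return PRP_POOL_NONE;        /* both pages fit in the command itself */

    list_entries = pages - 1;        /* every page after the first */
    if (list_entries <= SMALL_POOL_BYTES / PRP_ENTRY_BYTES)
        return PRP_POOL_SMALL;       /* up to 32 entries fit in 256 bytes */
    return PRP_POOL_PAGE;            /* larger lists take a whole page */
}
```

Under these assumptions a 128 KiB I/O needs 31 list entries, so it fits in a 256-byte element instead of consuming a full page.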

NVMe: Switch to use DMA Pool API · 091b6092

Matthew Wilcox authored Feb 10, 2011

Calling dma_free_coherent from interrupt context causes warnings.
Using the DMA pools delays freeing until pool destruction, so avoids
the problem.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>


NVMe: Rename nvme_req_info to nvme_bio · d534df3c

Matthew Wilcox authored Feb 10, 2011

There are too many things called 'info' in this driver.  This data
structure is auxiliary information for a struct bio, so call it nvme_bio,
or nbio when used as a variable.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>


NVMe: Initial PRP List support · e025344c

Shane Michael Matthews authored Feb 10, 2011

Add a pointer to the nvme_req_info to hold a new data structure
(nvme_prps) which contains a list of the pages allocated to this
particular request for holding PRP list entries.  nvme_setup_prps()
now returns this pointer.

To allocate and free the memory used for PRP lists, we need a struct
device, so we need to pass the nvme_queue pointer to many functions
which didn't use to need it.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>


NVMe: Advance the sg pointer when filling in an sg list · 51882d00

Matthew Wilcox authored Feb 10, 2011

For multipage BIOs, we were always using sg[0] instead of advancing
through the list.  Oops :-)
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>


NVMe: Renumber the special context values · d2d87034

Matthew Wilcox authored Feb 07, 2011

If POISON_POINTER_DELTA isn't defined, ensure they're in page 0 which
should never be mapped.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>


NVMe: Handle the congestion list a little better · 9294bbed

Matthew Wilcox authored Feb 07, 2011

In the bio completion handler, check for bios on the congestion list
for this NVM queue. Also, lock the congestion list in the make_request
function as the queue may end up being shared between multiple CPUs.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
