Commits · 87c8331fcf72e501c3a3c0cdc5c9391ec72f7cf2 · nexedi / linux

19 Feb, 2012 40 commits

[SCSI] libsas: prevent domain rediscovery competing with ata error handling · 87c8331f

Dan Williams authored Nov 17, 2011

libata error handling provides a timeout for link recovery.  libsas
must not rescan for previously known devices in this interval; otherwise
it may remove a device that is simply waiting for its link to recover.
Let libata-eh make the determination of when the link is stable and
prevent libsas (host workqueue) from taking action while this
determination is pending.

Using a mutex (ha->disco_mutex) to flush and disable revalidation while
eh is running requires any discovery action that may block on eh be
moved to its own context outside the lock.  Probing ATA devices
explicitly waits on ata-eh and the cache-flush-io issued during device
removal may also pend awaiting eh completion.  Essentially any rphy
add/remove activity needs to run outside the lock.

This adds two new cleanup states for sas_unregister_domain_devices():
'allocated-but-not-probed' and 'flagged-for-destruction'.  In the
'allocated-but-not-probed' state, dev->rphy points to an rphy that is
known not to have been through a sas_rphy_add() event.  At domain
teardown, check whether this device is still pending probe and clean up
accordingly.  Similarly, if a device has already been queued for removal
then sas_unregister_domain_devices() has nothing to do.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] libsas: convert dev->gone to flags · e139942d

Dan Williams authored Jan 07, 2012

In preparation for adding tracking of another device state "destroy".
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] libsas: remove ata_port.lock management duties from lldds · 312d3e56

Dan Williams authored Nov 17, 2011

Each libsas driver (mvsas, pm8001, and isci) has invented a different
method for managing the ap->lock.  The lock is held by the ata
->queuecommand() path.  mvsas drops it prior to acquiring any internal
locks which allows it to hold its internal lock across calls to
task->task_done().  This capability is important as it is the only way
the driver can flush task->task_done() instances to guarantee that it no
longer has any in-flight references to a domain_device at
->lldd_dev_gone() time.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] libsas: introduce sas_drain_work() · b1124cd3

Dan Williams authored Dec 19, 2011

When an lldd invokes ->notify_port_event() it can trigger a chain of libsas
events to:

1/ form the port and find the direct attached device

2/ if the attached device is an expander perform domain discovery

A call to flush_workqueue() will only flush the initial port formation work.
Currently libsas users need to call scsi_flush_work() up to the maximum depth
of the chain (which will grow from 2 to 3 when ata discovery is moved to its
own discovery event). Instead of open coding multiple calls, switch to
drain_workqueue() to flush sas work.

drain_workqueue() does not handle new work submitted during the drain so
libsas needs a bit of infrastructure to hold off unchained work submissions
while a drain is in flight. A lldd ->notify() event is considered 'unchained'
while a sas_discover_event() is 'chained'. As Tejun notes:

"For now, I think it would be best to add private wrapper in libsas to
support deferring unchained work items while draining."
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
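
The deferral described above can be sketched in plain userspace C. This is
only an illustration under stated assumptions: struct demo_ha, the demo_*
helpers and the fixed-size arrays are hypothetical names invented here, not
the libsas implementation, and the sketch is single-threaded with no locking.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_WORK 8

/* Hypothetical stand-in for the host adapter state; not the libsas layout. */
struct demo_ha {
    bool draining;                /* true while a drain is in flight */
    int queued[DEMO_MAX_WORK];    /* work handed to the workqueue */
    size_t nqueued;
    int deferred[DEMO_MAX_WORK];  /* unchained work held back during a drain */
    size_t ndeferred;
};

/* Append one work item to a fixed-size list; fails when the list is full. */
static bool demo_push(int *slots, size_t *count, int work)
{
    if (*count >= DEMO_MAX_WORK)
        return false;
    slots[(*count)++] = work;
    return true;
}

/* Chained work is submitted by work that is already running: always queue it. */
static bool demo_queue_chained(struct demo_ha *ha, int work)
{
    return demo_push(ha->queued, &ha->nqueued, work);
}

/* Unchained work (for example a new lldd notification) waits out a drain. */
static bool demo_queue_unchained(struct demo_ha *ha, int work)
{
    if (ha->draining)
        return demo_push(ha->deferred, &ha->ndeferred, work);
    return demo_push(ha->queued, &ha->nqueued, work);
}

static void demo_drain_begin(struct demo_ha *ha)
{
    ha->draining = true;
}

/* After the drain, release everything that was held back, in order. */
static void demo_drain_end(struct demo_ha *ha)
{
    size_t i;

    ha->draining = false;
    for (i = 0; i < ha->ndeferred; i++)
        demo_push(ha->queued, &ha->nqueued, ha->deferred[i]);
    ha->ndeferred = 0;
}
```

Between demo_drain_begin() and demo_drain_end() only chained submissions
reach the queue, so the set of work being drained can shrink to empty.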

[SCSI] libsas: convert ha->state to flags · f8daa6e6

Dan Williams authored Dec 19, 2011

In preparation for adding new states (SAS_HA_DRAINING, SAS_HA_FROZEN),
convert ha->state into a set of flags.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] libsas: replace event locks with atomic bitops · b15ebe0b

Dan Williams authored Nov 17, 2011

The locks only served to make sure the pending event bitmask was updated
consistently.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
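
As a rough illustration of why a lock is not needed just to keep a pending
bitmask consistent, here is a userspace sketch using C11 atomics. The
demo_* names and event numbers are hypothetical; atomic_fetch_or() stands
in for the kernel's test_and_set_bit(), which this sketch does not call.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical event numbers, for illustration only. */
enum demo_event { DEMO_EVENT_A = 0, DEMO_EVENT_B = 1 };

/* One bit per event type; a set bit means "already pending". */
static atomic_ulong demo_pending;

/* Atomically set the bit for ev and report whether it was already set. */
static bool demo_test_and_set(enum demo_event ev)
{
    unsigned long mask = 1UL << ev;

    return (atomic_fetch_or(&demo_pending, mask) & mask) != 0;
}

/* Clear the bit once the event has been handled. */
static void demo_clear(enum demo_event ev)
{
    atomic_fetch_and(&demo_pending, ~(1UL << ev));
}

/* Returns true if the caller should queue work, false if already pending. */
static bool demo_notify(enum demo_event ev)
{
    return !demo_test_and_set(ev);
}
```

Because the read-modify-write is a single atomic operation, two concurrent
callers cannot both see the bit as clear for the same event.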

[SCSI] libsas: fix leak of dev->sata_dev.identify_[packet_]device · 756f173f

Dan Williams authored Nov 17, 2011

These are never freed in the nominal path.  A domain_device has a
different lifetime than a sas_rphy, so we need a dev->rphy-independent
way of identifying sata devices.
Reviewed-by: Jack Wang <jack_wang@usish.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] libsas: fix domain_device leak · 735f7d2f

Dan Williams authored Nov 17, 2011

Arrange for the deallocation of a struct domain_device object when it no
longer has:
1/ any children
2/ references by any scsi_targets
3/ references by a lldd

The comment about domain_device lifetime in
Documentation/scsi/libsas.txt is stale as it appears mainline never had
a version of a struct domain_device that was registered as a kobject.
We now manage domain_device reference counts on behalf of external
agents.
Reviewed-by: Jack Wang <jack_wang@usish.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
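
The lifetime rule above (free only when no children and no external
references remain) can be sketched with a simple reference count. This is a
single-threaded userspace illustration; struct demo_dev and the demo_dev_*
helpers are hypothetical and are not the libsas API. The kernel would use
an atomic counter rather than a plain int.

```c
#include <stdlib.h>

/* Hypothetical stand-in for struct domain_device. */
struct demo_dev {
    int refcount;             /* plain int: single-threaded sketch only */
    struct demo_dev *parent;  /* each child holds one reference on its parent */
    int *freed;               /* set to 1 on release, for demonstration */
};

/* Allocate a device with one initial reference; a child pins its parent. */
static struct demo_dev *demo_dev_alloc(struct demo_dev *parent, int *freed)
{
    struct demo_dev *dev = calloc(1, sizeof(*dev));

    if (!dev)
        return NULL;
    dev->refcount = 1;
    dev->parent = parent;
    dev->freed = freed;
    if (parent)
        parent->refcount++;
    return dev;
}

/* Taken on behalf of an external agent, e.g. a scsi_target or an lldd. */
static void demo_dev_get(struct demo_dev *dev)
{
    dev->refcount++;
}

/* Drop one reference; on the last one, free and release the parent too. */
static void demo_dev_put(struct demo_dev *dev)
{
    while (dev && --dev->refcount == 0) {
        struct demo_dev *parent = dev->parent;

        if (dev->freed)
            *dev->freed = 1;
        free(dev);
        dev = parent;  /* drop the reference the child held */
    }
}
```

With this shape a parent outlives its children automatically, because each
child's reference is only dropped when the child itself is freed.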

[SCSI] libsas: kill sas_slave_destroy · 6f4e75a4

Dan Williams authored Nov 17, 2011

Per commit 3e4ec344 "libata: kill ATA_FLAG_DISABLED", needing to set
ATA_DEV_NONE is a holdover from before libsas converted to the
"new-style" ata-eh.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] libsas: remove unused ata_task_resp fields · 95ac7fd1

Dan Williams authored Nov 17, 2011

Commit 1e34c838 "[SCSI] libsas: remove spurious sata control register
read/write" removed the routines to fake the presence of the sata
control registers.  Now remove the unused data structure fields to kill
any remaining confusion.
Acked-by: Jack Wang <jack_wang@usish.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] Handle disk devices which can not process medium access commands · 18a4d0a2

Martin K. Petersen authored Feb 09, 2012

We have experienced several devices which fail in a fashion we do not
currently handle gracefully in SCSI. After a failure these devices will
respond to the SCSI primary command set (INQUIRY, TEST UNIT READY, etc.)
but any command accessing the storage medium will time out.

The following patch adds a callback that can be used by upper level
drivers to inspect the results of an error handling command. This in
turn has been used to implement additional checking in the SCSI disk
driver.

If a medium access command fails twice but TEST UNIT READY succeeds both
times in the subsequent error handling we will offline the device. The
maximum number of failed commands required to take a device offline can
be tweaked in sysfs.

Also add a new error flag to scsi_debug which allows this scenario to be
easily reproduced.

[jejb: fix up integer parsing to use kstrtouint]
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
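
The offlining rule described above can be sketched as a small counter. This
is an illustration only: struct demo_disk, its field names and
demo_eh_result() are hypothetical and do not reproduce the sd driver code;
the threshold of 2 used in the usage note mirrors the "fails twice" wording.

```c
#include <stdbool.h>

/* Hypothetical per-disk state; field names are illustrative. */
struct demo_disk {
    unsigned int medium_access_failures; /* failures where TUR still worked */
    unsigned int max_failures;           /* threshold, tunable by the admin */
    bool offline;
};

/*
 * Inspect the outcome of error handling for one failed command.
 * Only a failed medium access command followed by a successful
 * TEST UNIT READY counts towards taking the device offline.
 */
static void demo_eh_result(struct demo_disk *d, bool was_medium_access,
                           bool tur_succeeded)
{
    if (d->offline)
        return;
    if (was_medium_access && tur_succeeded)
        d->medium_access_failures++;
    if (d->medium_access_failures >= d->max_failures)
        d->offline = true;
}
```

For example, with max_failures set to 2, two failed medium access commands
that are each followed by a successful TEST UNIT READY mark the disk
offline, while failures of other commands leave the counter untouched.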

[SCSI] mpt2sas: spell "primitive" correctly in function prototype · a78e21dc

Andrew Morton authored Feb 08, 2012

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: "Nandigama, Nagalakshmi" <Nagalakshmi.Nandigama@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] virtio-scsi: SCSI driver for QEMU based virtual machines · 4fe74b1c

Paolo Bonzini authored Feb 05, 2012

The virtio-scsi HBA is the basis of an alternative storage stack
for QEMU-based virtual machines (including KVM).  Compared to
virtio-blk it is more scalable (because it supports many LUNs
on a single PCI slot), more powerful (it more easily supports
passthrough of host devices to the guest) and more easily
extensible (new SCSI features implemented by QEMU should not
require updating the driver in the guest).
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] hpsa: add some older controllers to the kdump blacklist · 5a4f934e

Tomas Henzl authored Feb 14, 2012

Some other older controllers also have problems performing a kdump.
Adding controllers to this list means that the driver will correctly
signal this inability via a resettable flag.
The unsupported list was created after a consultation with HP.
Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Acked-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] scsi_error: classify some ILLEGAL_REQUEST sense as a permanent TARGET_ERROR · 47ac56db

Mike Snitzer authored Feb 13, 2012

Permanent target failures are non-retryable and should be classified as
TARGET_ERROR; otherwise dm-multipath will retry an IO request that will
always fail at the target.

A SCSI command that fails with ILLEGAL_REQUEST sense and Additional
sense 0x20, 0x21, 0x24 or 0x26 represents a permanent TARGET_ERROR.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
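
The classification rule above amounts to a small lookup on the sense key
and additional sense code. The sketch below is a standalone illustration;
demo_is_permanent_target_error() and DEMO_ILLEGAL_REQUEST are hypothetical
names, not the scsi_error.c code. The value 0x05 is the standard SCSI sense
key for ILLEGAL REQUEST.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_ILLEGAL_REQUEST 0x05 /* SCSI sense key: ILLEGAL REQUEST */

/*
 * Return true when the sense data describes a failure that will recur on
 * every retry and on every path, i.e. a permanent target error.
 */
static bool demo_is_permanent_target_error(uint8_t sense_key, uint8_t asc)
{
    if (sense_key != DEMO_ILLEGAL_REQUEST)
        return false;

    switch (asc) {
    case 0x20: /* invalid command operation code */
    case 0x21: /* logical block address out of range */
    case 0x24: /* invalid field in CDB */
    case 0x26: /* invalid field in parameter list */
        return true;
    default:
        return false;
    }
}
```

A multipath layer that sees this result has no reason to retry the request
on another path, since the target itself rejects the command.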

[SCSI] sd: Make sure provisioning mode is reported correctly · 89730393

Martin K. Petersen authored Feb 13, 2012

The provisioning_mode parameter in sysfs did not get updated in the
SD_LBP_DISABLE case. Make sure the provisioning mode is always set
correctly.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] Ensure discard failure gets treated as a target problem · 66a651aa

Martin K. Petersen authored Feb 13, 2012

The error reported up the stack for a discard failure did not clearly
indicate that the command was processed and subsequently failed by the
target device.

Return -EREMOTEIO so multipathing does not classify this condition as a
path failure.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] mpt2sas: add missing allocation check · c834b1c4

Tomas Henzl authored Feb 13, 2012

__get_free_pages() can fail, so the return value should be checked.
Spotted thanks to Stanislaw.
Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Acked-by: "Nandigama, Nagalakshmi" <Nagalakshmi.Nandigama@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] qla4xxx: Update driver version to 5.02.00-k14 · cf3059a1

Vikas Chaudhary authored Feb 13, 2012

Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] qla4xxx: Added ping support · c0b9d3f7

Vikas Chaudhary authored Feb 13, 2012

Added ping support for network connection diagnostics.
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] iscsi_transport: Added Ping support · ac20c7bf

Vikas Chaudhary authored Feb 13, 2012

Added ping support for the iSCSI adapter; applications can use this
interface to diagnose network connections.
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] qla4xxx: added support for host event · ff884430

Vikas Chaudhary authored Aug 29, 2011

Added support for posting kernel host events to applications over the
netlink interface.
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] scsi_transport_iscsi: added support for host event · a11e2545

Vikas Chaudhary authored Feb 13, 2012

Added support for posting kernel host events to applications over the
netlink interface.
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] qla4xxx: Proper detection of firmware abort error code for ISP82xx · 46801ba6

Vikas Chaudhary authored Feb 13, 2012

Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] qla4xxx: Remove un-necessary print statement · badc5b99

Lalit Chandivade authored Feb 13, 2012

When acquiring the ROM lock times out, the driver spews a lot of warning
messages in a for loop.  Remove the unwanted warning message to reduce
kernel message clutter.
Signed-off-by: Lalit Chandivade <lalit.chandivade@qlogic.com>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] qla4xxx: Modified debug log messages for boot info. · e8fb00e0

Manish Rangankar authored Feb 13, 2012

In some configurations the user may not have boot targets configured.
In such cases the debug messages printed by the driver look as if
some kind of failure is happening. However, this can be a valid
case, so the messages were modified to appear as warnings rather
than failures.
Signed-off-by: Manish Rangankar <manish.rangankar@qlogic.com>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] qla4xxx: Fix verify boot idx correctly · 20e835b4

Lalit Chandivade authored Feb 13, 2012

qla4xxx_verify_boot_idx can falsely report a DDB to be a boot target
if ha->pri_ddb_idx and ha->sec_ddb_idx are not initialized correctly.
For example, if there is a DDB entry in flash at index 0, then
qla4xxx_verify_boot_idx would return the wrong result because
ha->pri_ddb_idx is not set correctly. Fixed qla4xxx_get_boot_info to set
ha->pri_ddb_idx and ha->sec_ddb_idx correctly.
Signed-off-by: Lalit Chandivade <lalit.chandivade@qlogic.com>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] qla4xxx: Fix un-necessary delay on invalid DDB · 981c982c

Lalit Chandivade authored Feb 13, 2012

Fix the unnecessary wait for completion of a sendtarget on an
invalid DDB entry. The state of an invalid DDB entry is 0 (unassigned).

This will also avoid the delays during system boot.
Signed-off-by: Lalit Chandivade <lalit.chandivade@qlogic.com>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] qla4xxx: Remove unused code · 45857216

Vikas Chaudhary authored Feb 13, 2012

This code was initially added for FW debugging. We don't need it
now, so take it out.
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] libfc: Handle discovery failure during ctlr link down · 00832084

Bhanu Prakash Gollapudi authored Feb 10, 2012

While we wait for GPN_FT response, if the ctlr link goes down, the stack
generates a completion for GPN_FT with error FC_EXCH_CLOSED, and reports a
discovery error. Discovery is not retried in this case, and rightly so.
However, the 'pending' flag stays set, which does not allow subsequent
discovery to succeed as GPN_FT will never be issued. Fix it by clearing the
pending flag when the discovery fails due to GPN_FT failure.
Signed-off-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] libfc: Fix panic in fc_exch_recv · d4042e9c

Bhanu Prakash Gollapudi authored Feb 10, 2012

Adding the host to the zone and then removing it causes this panic:

BUG: unable to handle kernel NULL pointer dereference at 00000000000000a0
IP: [<ffffffffa0491707>] fc_exch_recv+0xc57/0xe70 [libfc]
Call Trace:
[<ffffffffa050e04b>] bnx2fc_l2_rcv_thread+0x37b/0x430 [bnx2fc]
[<ffffffffa050dcd0>] ? bnx2fc_l2_rcv_thread+0x0/0x430 [bnx2fc]
[<ffffffff81090886>] kthread+0x96/0xa0
[<ffffffff8100c14a>] child_rip+0xa/0x20
[<ffffffff810907f0>] ? kthread+0x0/0xa0
[<ffffffff8100c140>] ? child_rip+0x0/0x20

During fc_exch_reset, the active exchanges are aborted and the exch is deleted.
As part of processing ABTS response, due to 'ep' being NULL, any access to ep in
fc_exch_recv_bls() causes this panic. Fixed to access 'ep' only if non-NULL.
Reviewed-by: Neerav Parikh <neerav.parikh@intel.com>
Signed-off-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] fcoe: Remove reference counting on 'struct fcoe_interface' · 1a8ef414

Robert Love authored Feb 10, 2012

The reference counting was necessary on these instances
because it was possible for NPIV ports to be destroyed
after the N_Port. A previous patch ensures that all NPIV
ports are destroyed before the N_Port, so there is no longer
any need to track references on the interface.
Signed-off-by: Robert Love <robert.w.love@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] fcoe: Do not switch context in vport_delete callback · ccefd23e

Robert Love authored Feb 10, 2012

Currently all port deletion is routed through the FCoE
workqueue (fcoe_wq). When fc_remove_host is called on
an N_Port (for example, from fcoe_destroy) the vports
are queued into a FC Transport workqueue. fc_remove_host
flushes that queue and each vport is passed to fcoe's
fcoe_vport_destroy, which simply queues the associated
fcoe_ports for later deletion. This queue cannot be
flushed within the N_Port's destroy path because of
circular locking issues. The result is that the NPIV
ports are destroyed after the N_Port, which is the reverse
of how they are created.

This quirk causes fcoe to keep references on the
fcoe_interface shared by each of these ports (N_Port
and NPIV). Changing the ordering such that NPIV ports
are destroyed before the N_Port will allow us to remove
reference counting on the fcoe_interface instances.

This patch simply allows fcoe_vport_destroy to destroy
NPIV ports without deferring them to a workqueue context.
This ensures that when fc_remove_host is called the
NPIV ports will be destroyed first, before the N_Port, and
allows reference counting on fcoe's fcoe_interface
to be removed in a later patch.
Signed-off-by: Robert Love <robert.w.love@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] fcoe: Rename out_nomod label to out_putmod · 6f68794c

Robert Love authored Feb 10, 2012

The label implies that it should be called when
there is 'nomod.' I read that to mean that the
module reference 'get' failed. However, it's only
called when the module reference 'get' succeeded.

I think it makes more sense to name the label,
'out_putmod' since it should be called when we
need to 'put' the module reference taken in the
routine before returning.
Signed-off-by: Robert Love <robert.w.love@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
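
The naming argument above is easier to see in a goto-based cleanup path.
The following userspace sketch is hypothetical: demo_module_get(),
demo_module_put() and demo_create() are invented stand-ins and do not
reproduce the fcoe routine. It only shows a label that is reached when the
module reference was successfully taken and therefore has to be put.

```c
#include <errno.h>
#include <stdbool.h>

static int demo_module_refs; /* stand-in for a module reference count */

/* Try to take a module reference; fails when the module is unavailable. */
static bool demo_module_get(bool available)
{
    if (!available)
        return false;
    demo_module_refs++;
    return true;
}

static void demo_module_put(void)
{
    demo_module_refs--;
}

static int demo_create(bool module_available, bool setup_ok)
{
    int rc = 0;

    /* The 'get' failed: nothing was taken, so there is nothing to undo. */
    if (!demo_module_get(module_available))
        return -ENODEV;

    if (!setup_ok) {
        rc = -EINVAL;
        goto out_putmod;
    }

    /* ... the actual work would go here ... */

out_putmod:
    /* Reached only with the reference held, hence "putmod". */
    demo_module_put();
    return rc;
}
```

Naming the label after the action it performs makes it harder to jump to it
from a path where the reference was never taken.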

[SCSI] fcoe: Allow exposing FDMI attributes via sysfs · 7e5adcfb

Neerav Parikh authored Feb 10, 2012

Allow FDMI attributes to be exposed via the fc_host
class object for the fcoe driver.
Signed-off-by: Neerav Parikh <neerav.parikh@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] libfcoe: Don't KERN_ERR on netdev notification · b99fbf6a

Robert Love authored Feb 10, 2012

This is more of a debug statement. As a KERN_ERR we generate
log entries anytime any netdev goes up or down, so when booting
there are notification log entries for all system interfaces
including 'lo'. This is too much. Let's just log when necessary.
Signed-off-by: Robert Love <robert.w.love@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] isci: T10 DIF support · 3d2d7525

Dave Jiang authored Feb 10, 2012

This allows the controller to do WRITE_INSERT and READ_STRIP for SAS
disks that support protection information. To use this feature, SAS
disks must be formatted with protection information via sg_format.

  sg3_utils-1.32 -- sg_format version 1.19 20110730
  sg_format usage:
  sg_format --format --verbose --pinfo /dev/sda
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] qla2xxx: Avoid invalid request queue dereference for bad response packets. · a6fe35c0

Arun Easi authored Feb 09, 2012

Signed-off-by: Arun Easi <arun.easi@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] qla2xxx: Stop iteration after first failure in *_id functions. · dafdf892

Arun Easi authored Feb 09, 2012

Signed-off-by: Arun Easi <arun.easi@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>

[SCSI] qla2xxx: Fix incorrect register access in qla2x00_start_iocbs(). · 98878a16

Arun Easi authored Feb 09, 2012

Signed-off-by: Arun Easi <arun.easi@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
